VIRAL MATERIAL AND NUCLEOTIDE FRAGMENTS ASSOCIATED WITH
MULTIPLE SCLEROSIS, FOR DIAGNOSTIC, PROPHYLACTIC AND 
THERAPEUTIC PURPOSES 

Multiple sclerosis (MS) is a demyelinating
disease of the central nervous system (CNS), the cause of
which remains as yet unknown.

"Multiple sclerosis (MS) is the most common 
neurological disease of young adults with a prevalence in
Europe and North America of between 20 and 200 per
100,000. It is characterized clinically by a
relapsing/remitting or chronic progressive course,
frequently leading to severe disability. Current knowledge
suggests that MS is associated with autoimmunity, that
genetic background has an important influence and that
"infectious" agent(s) may be involved. Indeed, many
viruses have been proposed as possible candidates but as 
yet, none of them has been shown to play an aetiological 
role. 

Many studies have supported the hypothesis of a
viral aetiology of the disease, but none of the known
viruses tested has proved to be the causal agent sought: a
review of the viruses sought for several years in MS has
been compiled by E. Norrby (1) and R.T. Johnson (2).

The discovery of pathogenic retroviruses in man
(HTLVs and HIVs) was followed by great interest in their
ability to impair the immune system and to provoke central
nervous system inflammation and/or degeneration. In the
case of HTLV-1, its association with a chronic
inflammatory demyelinating disease in man (48) led to
extensive investigations to search for an HTLV-1-like
retrovirus in MS patients. However, despite initial
claims, the presence of HTLV-1 or HTLV-like retroviruses
was not confirmed.

Recently, a retrovirus different from the known
human retroviruses has been isolated in patients suffering
from MS (3, 4, and 5).

In 1989, the authors described the production of
extracellular virions, associated with reverse
transcriptase (RT) activity, by a culture of
leptomeningeal cells (LM7) obtained from the cerebrospinal
fluid of a patient with MS (3). This was followed by
similar findings in monocyte cultures from a series of MS
patients (5). Neither viral particles nor viral RT
activity was found in control individuals. Furthermore,
the authors were able to transfer the LM7 virus to
non-infected leptomeningeal cells in vitro (26). The molecular
characterization of the "LM7" retrovirus was a
prerequisite for further evaluation of its possible role
in MS. Considerable difficulties arose from the absence of
continuously productive retroviral cultures and from the
low levels of expression in the few transient cultures.
The strategy described here focused on RNA from
extracellular virions, in order to avoid non-specific
detection of cellular RNA and of endogenous elements from
contaminating human DNA. A specific retroviral sequence
associated with virions produced by cell cultures from
several MS patients has been identified. The entire
sequence of this novel retroviral genome is currently
being obtained using RT-PCR on RNA from extracellular
virions. The retrovirus previously called "LM7 virus"
corresponds to an oncovirus and is now designated MSRV
(Multiple Sclerosis-associated Retrovirus).

The authors were also able to show that this 
retrovirus could be transmitted in vitro, that patients 
suffering from MS produced antibodies capable of 
recognizing proteins associated with the infection of 
leptomeningeal cells by this retrovirus, and that the 
expression of the latter could be strongly stimulated by 
the immediate-early genes of some herpesviruses (6).

All these results point to the role in MS of at
least one unknown retrovirus or of a virus having reverse
transcriptase activity which is detectable according to
the method published by H. Perron (3) and qualified as
"LM7-like RT" activity. The content of the publication
identified by (3) is incorporated in the present
description by reference.

Recently, the Applicant's studies have enabled
two continuous cell lines infected with natural isolates
originating from two different patients suffering from MS
to be obtained by a culture method as described in the
document WO-A-93/20188, the content of which is
incorporated in the present description by reference. These two
lines, derived from human choroid plexus cells, designated
LM7PC and PLI-2, were deposited with the ECACC on
22nd July 1992 and 8th January 1993, respectively, under
numbers 92072201 and 93010817, in accordance with the
provisions of the Budapest Treaty. Moreover, the viral
isolates possessing LM7-like RT activity were also
deposited with the ECACC under the overall designation of
"strains". The "strain" or isolate harboured by the PLI-2
line, designated POL-2, was deposited with the ECACC on
22nd July 1992 under No. V92072202. The "strain" or
isolate harboured by the LM7PC line, designated MS7PG, was
deposited with the ECACC on 8th January 1993 under
No. V93010816.

Starting from the cultures and isolates
mentioned above, characterized by biological and
morphological criteria, the next step was to endeavour to
characterize the nucleic acid material associated with the
viral particles produced in these cultures.

The portions of the genome which have already
been characterized have been used to develop tests for
molecular detection of the viral genome and
immunoserological tests, using the amino acid sequences
encoded by the nucleotide sequences of the viral genome,
in order to detect the immune response directed against
epitopes associated with the infection and/or viral
expression.

These tools have already enabled an association
to be confirmed between MS and the expression of the
sequences identified in the patents cited later. However,
the viral system discovered by the Applicant is related to
a complex retroviral system. In effect, the sequences to
be found encapsidated in the extracellular viral particles
produced by the different cultures of cells of patients
suffering from MS show clearly that there is
coencapsidation of retroviral genomes which are related
to, but different from, the "wild-type" retroviral genome which
produces the infective viral particles. This phenomenon
has been observed between replicative retroviruses and
endogenous retroviruses belonging to the same family, or
even heterologous retroviruses. The notion of endogenous
retroviruses is very important in the context of our
discovery since, in the case of MSRV-1, it has been
observed that endogenous retroviral sequences comprising
sequences homologous to the MSRV-1 genome exist in normal
human DNA. The existence of endogenous retroviral elements
(ERV) related to MSRV-1 by all or part of their genome
explains the fact that the expression of the MSRV-1
retrovirus in human cells is able to interact with closely
related endogenous sequences. These interactions are to be
found in the case of pathogenic and/or infectious
endogenous retroviruses (for example some ecotropic
strains of the murine leukaemia virus), and in the case of
exogenous retroviruses whose nucleotide sequence may be
found partially or wholly, in the form of ERVs, in the
host animal's genome (e.g. mouse exogenous mammary tumor
virus transmitted via the milk). These interactions
consist mainly of (i) a trans-activation or coactivation
of ERVs by the replicative retrovirus, (ii) an
"illegitimate" encapsidation of RNAs related to ERVs, or
of ERVs - or even of cellular RNAs - simply possessing
compatible encapsidation sequences, in the retroviral
particles produced by the expression of the replicative
strain, which are sometimes transmissible and sometimes
with a pathogenicity of their own, and (iii) more or less
substantial recombinations between the coencapsidated
genomes, in particular in the phases of reverse
transcription, which lead to the formation of hybrid
genomes, which are sometimes transmissible and sometimes
with a pathogenicity of their own.

Thus, (i) different sequences related to MSRV-1
have been found in the purified viral particles; (ii)
molecular analysis of the different regions of the MSRV-1
retroviral genome should be carried out by systematically
analyzing the coencapsidated, interfering and/or
recombined sequences which are generated by the infection
and/or expression of MSRV-1; furthermore, some clones may
have defective sequence portions produced by the
retroviral replication and template errors and/or errors
of transcription of the reverse transcriptase; (iii) the
families of sequences related to the same retroviral
genomic region provide the means for an overall diagnostic
detection which may be optimized by the identification of
invariable regions among the clones expressed, and by the
identification of reading frames responsible for the
production of antigenic and/or pathogenic polypeptides
which may be produced only by a portion, or even by just
one, of the clones expressed, and, under these conditions,
the systematic analysis of the clones expressed in the
region of a given gene enables the frequency of variation
and/or of recombination of the MSRV-1 genome in this
region to be evaluated and the optimal sequences for the
applications, in particular diagnostic applications, to be
defined; (iv) the pathology caused by a retrovirus such as
MSRV-1 may be a direct effect of its expression and of the
proteins or peptides produced as a result thereof, but
also an effect of the activation, the encapsidation or the
recombination of related or heterologous genomes and of
the proteins or peptides produced as a result thereof;
thus, these genomes associated with the expression of
and/or infection by MSRV-1 are an integral part of the
potential pathogenicity of this virus, and hence
constitute means of diagnostic detection and special
therapeutic targets. Similarly, any agent associated with
or cofactor of these interactions responsible for the
pathogenesis in question, such as MSRV-2 or the gliotoxic
factor which are described in the patent application
published under No. FR-2,716,198, may participate in the
development of an overall and very effective strategy for
the diagnosis, prognosis, therapeutic monitoring and/or
integrated therapy of MS in particular, but also of any
other disease associated with the same agents.

In this context, a parallel discovery has been
made in another autoimmune disease, rheumatoid arthritis
(RA), which has been described in the French Patent
Application filed under No. 95/02960. This discovery shows
that, by applying methodological approaches similar to the
ones which were used in the Applicant's work on MS, it was
possible to identify a retrovirus expressed in RA which
shares the sequences described for MSRV-1 in MS, and also
the coexistence of an associated MSRV-2 sequence also
described in MS. As regards MSRV-1, the sequences detected
in common in MS and RA relate to the pol and gag genes. In
the current state of knowledge, it is possible to
associate the gag and pol sequences described with the
MSRV-1 strains expressed in these two diseases.

The present patent application relates to 
various results which are additional to those already 
protected by the following French Patent Applications: 
- No. 92/04322 of 03.04.1992, published under
No. 2,689,519;

- No. 92/13447 of 03.11.1992, published under
No. 2,689,521;

- No. 92/13443 of 03.11.1992, published under
No. 2,689,520;

- No. 94/01529 of 04.02.1994, published under
No. 2,715,936;

- No. 94/01531 of 04.02.1994, published under
No. 2,715,939;

- No. 94/01530 of 04.02.1994, published under
No. 2,715,936;

- No. 94/01532 of 04.02.1994, published under
No. 2,715,937;

- No. 94/14322 of 24.11.1994, published under
No. 2,727,428;

- and No. 94/15810 of 23.12.1994, published under
No. 2,728,585.

The present invention relates, in the first
place, to a viral material, in the isolated or purified
state, which may be recognized or characterized in
different ways:

- its genome comprises a nucleotide sequence chosen from
the group including the sequences SEQ ID NO: 46, SEQ ID
NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 56, SEQ ID
NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID
NO: 89, their complementary sequences and their equivalent
sequences, in particular nucleotide sequences displaying,
for any succession of 100 contiguous monomers, at least
50% and preferably at least 70% homology with the said
sequences SEQ ID NO: 46, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID
NO: 53, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID
NO: 60, SEQ ID NO: 61, SEQ ID NO: 89, respectively, and their
complementary sequences;

- the region of its genome comprising the env and pol
genes and a portion of the gag gene, excluding the
subregion having a sequence identical or equivalent to
SEQ ID NO: 1, codes for any polypeptide displaying, for any
contiguous succession of at least 30 amino acids, at least
50% and preferably at least 70% homology with a peptide
sequence encoded by any nucleotide sequence chosen from
the group including SEQ ID NO: 46, SEQ ID NO: 51, SEQ ID
NO: 52, SEQ ID NO: 53, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID
NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 89 and their
complementary sequences;

- the pol gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO: 57 or SEQ
ID NO: 93, excluding SEQ ID NO: 1;

- the gag gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO: 88.
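
By way of illustration only, the criterion "for any succession of 100 contiguous monomers, at least 50% homology" can be sketched as a sliding-window check. The sketch below assumes that "homology" is read as percent identity between two sequences that have already been aligned without gaps to the same length; the description does not prescribe this reading, and the function names `window_identities` and `meets_threshold` are hypothetical.

```python
def window_identities(ref: str, cand: str, window: int = 100) -> list[float]:
    """Percent identity for every run of `window` contiguous positions.

    Assumption: `ref` and `cand` are pre-aligned, gap-free and of equal length.
    """
    if len(ref) != len(cand):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or len(ref) < window:
        raise ValueError("window must be positive and no longer than the sequences")
    matches = [a == b for a, b in zip(ref.upper(), cand.upper())]
    count = sum(matches[:window])
    result = [100.0 * count / window]
    # Slide the window one position at a time, updating the match count.
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        result.append(100.0 * count / window)
    return result


def meets_threshold(ref: str, cand: str, window: int = 100,
                    threshold: float = 50.0) -> bool:
    """True if every window reaches at least `threshold` percent identity."""
    return min(window_identities(ref, cand, window)) >= threshold
```

The same helper could be applied to peptide sequences with `window=30`, under the same identity assumption.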

As indicated above, according to the present 
invention, the viral material as defined above is 
associated with MS. And as defined by reference to the pol 
or gag gene of MSRV-1, and more especially to the 
sequences SEQ ID NOS 51, 56, 57, 59, 60, 61, 88, 89, 93, 
169, 170, 171, 172, 176, 177, 178 and 179, this viral 
material is associated with RA. 

The present invention also relates to a nucleic 
material, in the isolated or purified state, having at 
least one of the following definitions:

- a nucleic material comprising a nucleotide sequence
selected from the group including sequences SEQ ID NO: 93,
SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171,
SEQ ID NO: 172, SEQ ID NO: 176, SEQ ID NO: 177,
SEQ ID NO: 178, SEQ ID NO: 179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences SEQ ID NO: 93,
SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171,
SEQ ID NO: 172, SEQ ID NO: 176, SEQ ID NO: 177,
SEQ ID NO: 178, SEQ ID NO: 179, and their complementary
sequences, excluding HSERV-9 (or ERV-9); advantageously,
the nucleotide sequence of said nucleic material is
selected from the group including sequences SEQ ID NO: 93,
SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171,
SEQ ID NO: 172, SEQ ID NO: 176, SEQ ID NO: 177,
SEQ ID NO: 178, SEQ ID NO: 179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 70% and preferably at least
80% homology with said sequences SEQ ID NO: 93,
SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171,
SEQ ID NO: 172, SEQ ID NO: 176, SEQ ID NO: 177,
SEQ ID NO: 178, SEQ ID NO: 179, and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
coding for any polypeptide displaying, for any contiguous
succession of at least 30 amino acids, at least 50%,
preferably at least 60%, and most preferably at least 70%
homology with a peptide sequence encoded by any nucleotide
sequence selected from the group including SEQ ID NO: 93,
SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171,
SEQ ID NO: 172, SEQ ID NO: 176, SEQ ID NO: 177,
SEQ ID NO: 178, SEQ ID NO: 179 and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
of retroviral type, comprising a nucleotide sequence
identical or similar to at least part of the pol gene of
an isolated retrovirus associated with multiple sclerosis
or rheumatoid arthritis; advantageously, said nucleotide
sequence is 80% similar to said at least part of the pol
gene;

- a nucleic material comprising a nucleotide sequence
identical or similar to at least part of the pol gene of an
isolated virus encoding a reverse transcriptase having an
enzymatic site comprised between the amino acid domains
LPQG-YXDD, having a phylogenic distance with HSERV-9 of
0.063 ± 0.1, and preferably 0.063 ± 0.05; the phylogenic
distances are calculated on the basis of a reference
sequence according to the UPGM tree option of the Geneworks™
Software (INTELLIGENETICS).

By enzymatic site, we understand the amino acid domain(s)
conferring the specific activity of a given enzyme.
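
As a non-limiting illustration, the region "comprised between the amino acid domains LPQG-YXDD" can be located in a translated pol sequence with a simple pattern search. The sketch assumes one-letter amino acid codes with X standing for any residue; the function name `rt_site_region` and the peptide in the usage note are hypothetical and are not taken from the sequence listing.

```python
import re

# LPQG, then the shortest possible stretch, then Y-any-D-D (the "YXDD" domain).
RT_SITE = re.compile(r"LPQG(.*?)Y.DD")


def rt_site_region(protein: str):
    """Return the residues lying between the LPQG and YXDD domains, or None."""
    match = RT_SITE.search(protein.upper())
    return match.group(1) if match else None
```

For a toy peptide such as `MKLPQGFKNSPTIFYMDDLL`, the helper returns the residues lying between the two domains; a sequence lacking either domain yields `None`.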

The present invention also relates to different
nucleotide fragments each comprising a nucleotide sequence
chosen from the group including:

(a) all the genomic sequences, partial and total, of the
pol gene of the MSRV-1 virus, except for the total
sequence of the nucleotide fragment defined by
SEQ ID NO: 1;

(b) all the genomic sequences, partial and total, of the
env gene of MSRV-1;

(c) all the partial genomic sequences of the gag gene of
MSRV-1;

(d) all the genomic sequences overlapping the pol gene and
the env gene of the MSRV-1 virus, and overlapping the pol
gene and the gag gene;

(e) all the sequences, partial and total, of a clone
chosen from the group including the clones FBd3
(SEQ ID NO: 46), t pol (SEQ ID NO: 51), JLBc1
(SEQ ID NO: 52), JLBc2 (SEQ ID NO: 53), GM3
(SEQ ID NO: 56), FBd13 (SEQ ID NO: 58), LB19 (SEQ ID NO: 59),
LTRGAG12 (SEQ ID NO: 60), FP6 (SEQ ID NO: 61) and G+E+A
(SEQ ID NO: 89), excluding any nucleotide sequence
identical to or lying within the sequence defined by
SEQ ID NO: 1;

(f) sequences complementary to the said genomic sequences;

(g) sequences equivalent to the said sequences (a) to (e),
in particular nucleotide sequences displaying, for any
succession of 100 contiguous monomers, at least 50% and
preferably at least 70% homology with the said sequences
(a) to (d),

provided that this nucleotide fragment does not comprise
or consist of the sequence ERV-9 as described in LA MANTIA
et al. (18).

The term genomic sequences, partial or total,
includes all sequences associated by coencapsidation or by
coexpression, or recombined sequences.

Preferably, such a fragment comprises:

- either a nucleotide sequence identical to a partial or
total genomic sequence of the pol gene of the MSRV-1
virus, except for the total sequence of the nucleotide
fragment defined by SEQ ID NO: 1, or identical to any
sequence equivalent to the said partial or total genomic
sequence, in particular one which is homologous to the
latter;

- or a nucleotide sequence identical to a partial or total
genomic sequence of the env gene of the MSRV-1 virus, or
identical to any sequence complementary to the said
nucleotide sequence, or identical to any sequence
equivalent to the said nucleotide sequence, in particular
one which is homologous to the latter.

In particular, the invention relates to a 
nucleotide fragment comprising a coding nucleotide 
sequence which is partially or totally identical to a 
nucleotide sequence chosen from the group including: 

- the nucleotide sequence defined by SEQ ID NO: 40, SEQ ID 
NO: 62 or SEQ ID NO: 89; 

- sequences complementary to SEQ ID NO: 40, SEQ ID NO: 62 or 
SEQ ID NO: 89; 

- sequences equivalent, and in particular homologous to 
SEQ ID NO: 40, SEQ ID NO: 62 or SEQ ID NO: 89; 

- sequences coding for all or part of the peptide sequence 
defined by SEQ ID NO: 39, SEQ ID NO: 63 or SEQ ID NO: 90; 

- sequences coding for all or part of a peptide sequence 
equivalent, in particular homologous to SEQ ID NO: 39, SEQ 
ID NO: 63 or SEQ ID NO: 90, which is capable of being 
recognized by sera of patients infected with the MSRV-1 
virus, or in whom the MSRV-1 virus has been reactivated.

The invention also relates to a nucleotide
fragment (called fragment I) having at least one of the
following definitions:

- a nucleotide fragment comprising a nucleotide sequence
selected from the group including SEQ ID NO: 93,
SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170, SEQ ID NO: 171,
SEQ ID NO: 172, SEQ ID NO: 176, SEQ ID NO: 177,
SEQ ID NO: 178, SEQ ID NO: 179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences and their complementary
sequences, said group excluding SEQ ID NO: 1,
said nucleotide fragment not comprising nor consisting of
the sequence HSERV-9 (or ERV-9); preferably the nucleotide
sequence of said fragment is selected from the group
including SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID NO: 169,
SEQ ID NO: 170, SEQ ID NO: 171, SEQ ID NO: 172,
SEQ ID NO: 176, SEQ ID NO: 177, SEQ ID NO: 178,
SEQ ID NO: 179, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 70% and preferably at least 80% homology with
said sequences and their complementary sequences;

- a nucleotide fragment comprising a coding nucleotide
sequence which is partially or totally identical to a
nucleotide sequence selected from the group including:
SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID NO: 169,
SEQ ID NO: 170, SEQ ID NO: 171, SEQ ID NO: 172,
SEQ ID NO: 176, SEQ ID NO: 177, SEQ ID NO: 178,
SEQ ID NO: 179; their complementary sequences; their
equivalent sequences, in particular homologous to
SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID NO: 169, SEQ ID NO: 170,
SEQ ID NO: 171, SEQ ID NO: 172, SEQ ID NO: 176,
SEQ ID NO: 177, SEQ ID NO: 178, SEQ ID NO: 179;
sequences encoding all or parts of the peptide
sequence defined by SEQ ID NO: 95, SEQ ID NO: 173,
SEQ ID NO: 174, SEQ ID NO: 175, SEQ ID NO: 180,
SEQ ID NO: 181, SEQ ID NO: 182;
sequences encoding all or parts of a peptide
sequence equivalent, in particular homologous to
SEQ ID NO: 95, SEQ ID NO: 173, SEQ ID NO: 174, SEQ ID NO: 175,
SEQ ID NO: 180, SEQ ID NO: 181, SEQ ID NO: 182, which is
capable of being recognized by sera of patients infected
with the MSRV-1 virus, or in whom the MSRV-1 virus has
been reactivated.

The invention also relates to any nucleic acid
probe for the detection of virus associated with MS and/or
rheumatoid arthritis (RA), which is capable of hybridizing
specifically with any fragment such as is defined above,
belonging or lying within the genome of the said
pathogenic agent. It relates, in addition, to any nucleic
acid probe for detection of a pathogenic and/or infective
agent associated with RA, which is capable of hybridizing
specifically with any fragment as defined above by
reference to the pol and gag genes, and especially with
respect to the sequences SEQ ID NOS 40, 51, 56, 59, 60,
61, 62, 89 and SEQ ID NOS 39, 63 and 90.

The invention also relates to a primer for the
amplification by polymerization of an RNA or a DNA of a
viral material, associated with MS and/or RA, comprising a
nucleotide sequence identical or equivalent to at least
one portion of the nucleotide sequence of any fragment
such as is defined above, in particular a nucleotide
sequence displaying, for any succession of at least 10
contiguous monomers, preferably 15 contiguous monomers,
more preferably 18 contiguous monomers and most
preferably 20 contiguous monomers, at least 70% homology
with at least the said portion of the said fragment.
Preferably, the nucleotide sequence of such a primer is
identical to any one of the sequences selected from the
group including SEQ ID NO: 47 to SEQ ID NO: 50,
SEQ ID NO: 55, SEQ ID NO: 64, SEQ ID NO: 86, SEQ ID NO: 99 to
SEQ ID NO: 111, SEQ ID NO: 183, SEQ ID NO: 184,
SEQ ID NO: 185, SEQ ID NO: 186.
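
Purely as an illustrative sketch, a candidate primer can be screened against a fragment by sliding it along the fragment and recording the best ungapped percent identity, which may then be compared with the 70% figure given above. Treating "homology" as ungapped identity is an assumption of this sketch, and the function name `best_ungapped_identity` and the sequences in the usage note are hypothetical rather than taken from the sequence listing.

```python
def best_ungapped_identity(primer: str, target: str) -> tuple[int, float]:
    """Return (0-based offset, percent identity) of the best ungapped placement.

    Assumption: identity is counted position by position, without gaps.
    Ties are resolved in favour of the leftmost placement.
    """
    primer, target = primer.upper(), target.upper()
    if not primer or len(primer) > len(target):
        raise ValueError("primer must be non-empty and no longer than the target")
    best_pos, best_pct = 0, -1.0
    for pos in range(len(target) - len(primer) + 1):
        segment = target[pos:pos + len(primer)]
        matches = sum(a == b for a, b in zip(primer, segment))
        pct = 100.0 * matches / len(primer)
        if pct > best_pct:
            best_pos, best_pct = pos, pct
    return best_pos, best_pct
```

For example, a 10-mer placed against a 20-nucleotide toy target containing a copy with one mismatch gives an offset and a percentage that can be tested against the chosen threshold.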

Generally speaking, the invention also
encompasses any RNA or DNA, and in particular replication
vector, comprising a genomic fragment of the viral
material such as is defined above, or a nucleotide
fragment such as is defined above.

The invention also relates to the different
peptides encoded by any open reading frame belonging to a
nucleotide fragment such as is defined above, in
particular any polypeptide, for example any oligopeptide
forming or comprising an antigenic determinant recognized
by sera of patients infected with the MSRV-1 virus and/or
in whom the MSRV-1 virus has been reactivated. Preferably,
this polypeptide is antigenic, and is encoded by the open
reading frame beginning, in the 5'-3' direction, at
nucleotide 181 and ending at nucleotide 330 of
SEQ ID NO: 1.

The invention also encompasses the following
polypeptides:

a)

- a polypeptide encoded by any open reading frame
belonging to a nucleotide fragment, fragment I, as defined
above;

- a polypeptide, characterized in that the open reading
frame encoding it is comprised, in the 5'-3' direction,
between nucleotide 18 and nucleotide 2304 of SEQ ID NO: 93;

- a polypeptide, having a peptide sequence comprising a
sequence partially or totally identical to SEQ ID NO: 95;

b)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO: 96; in particular said polypeptide
exhibits an enzymatic activity consisting of proteolytic
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 18 and ends at nucleotide
340 of SEQ ID NO: 93;

- a polypeptide having an inhibitory activity on the
proteolytic activity of a polypeptide as defined according
to b);

c)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO: 97; in particular said polypeptide
exhibits a reverse transcriptase activity;

- a polypeptide having a peptide sequence which comprises
a sequence identical or equivalent to SEQ ID NO: 98; in
particular said polypeptide exhibits a ribonuclease
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 341 and ends at nucleotide
2304 of SEQ ID NO: 93;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 1858 and ends at nucleotide
2304 of SEQ ID NO: 93;

- a polypeptide having an inhibitory activity on the
reverse transcriptase activity of a polypeptide as defined
according to c) or on the ribonuclease H activity of a
polypeptide as defined according to c).
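
The nucleotide positions given above are 1-based and inclusive, read in the 5'-3' direction. As a hedged sketch, a region defined in this way can be extracted and conceptually translated as follows. The sketch assumes the standard genetic code and simply drops any incomplete trailing codon; the helper names `subsequence` and `translate` and the toy sequence in the usage note are hypothetical and do not reproduce SEQ ID NO: 93.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered with T, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides `start`..`end` (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def translate(nucleotides: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    nt = nucleotides.upper().replace("U", "T")
    usable = len(nt) - len(nt) % 3  # simplification: drop a partial last codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))
```

With a toy sequence `GGATGGCCAAATAACC`, `subsequence(seq, 3, 14)` returns the twelve nucleotides from position 3 to position 14, which `translate` then converts codon by codon.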

In particular, the invention relates to an
antigenic polypeptide recognized by the sera of patients
infected with the MSRV-1 virus, and/or in whom the MSRV-1
virus has been reactivated, whose peptide sequence is
partially or totally identical or is equivalent to the
sequence defined by SEQ ID NO: 39, SEQ ID NO: 63,
SEQ ID NO: 87, SEQ ID NO: 95, SEQ ID NO: 96, SEQ ID NO: 97,
SEQ ID NO: 98, SEQ ID NO: 173, SEQ ID NO: 174, SEQ ID NO: 175,
SEQ ID NO: 180, SEQ ID NO: 181 and SEQ ID NO: 182; such a
sequence is identical, for example, to any sequence
selected from the group including the sequences
SEQ ID NO: 41 to SEQ ID NO: 44, SEQ ID NO: 63 and
SEQ ID NO: 87.

The present invention also proposes mono- or 
polyclonal antibodies directed against the MSRV-1 virus, 
which are obtained by the immunological reaction of a 
human or animal body or cells to an immunogenic agent 
consisting of an antigenic polypeptide such as is defined
above.

The invention next relates to:

- reagents for detection of the MSRV-1 virus, or of an
exposure to the latter, comprising at least one reactive
substance selected from the group consisting of a probe of
the present invention, a polypeptide, in particular an
antigenic peptide, such as is defined above, or an
anti-ligand, in particular an antibody to the said polypeptide;

- all diagnostic, prophylactic or therapeutic compositions
comprising one or more peptides, in particular antigenic
peptides, such as are defined above, or one or more
anti-ligands, in particular antibodies to the peptides,
discussed above; such a composition is preferably, and by
way of example, a vaccine composition.

The invention also relates to any diagnostic,
prophylactic or therapeutic composition, in particular for
inhibiting the expression of at least one virus associated
with MS or RA, and/or the enzymatic activities of the
proteins of said virus, comprising a nucleotide fragment
such as is defined above or a polynucleotide, in
particular oligonucleotide, whose sequence is partially
identical to that of the said fragment, except for that of
the fragment having the nucleotide sequence SEQ ID NO: 1.
Likewise, it relates to any diagnostic, prophylactic or
therapeutic composition, in particular for inhibiting the
expression of at least one pathogenic and/or infective
agent associated with RA, comprising a nucleotide fragment
such as is defined above by reference to the pol and gag
genes, and especially with respect to the sequences
SEQ ID NOS 40, 51, 56, 59, 60, 61, 62 and 89.

According to the invention, these same fragments
or polynucleotides, in particular oligonucleotides, may
participate in all suitable compositions for detecting,
according to any suitable process or method, a
pathological and/or infective agent associated with MS and with
RA, respectively, in a biological sample. In such a
process, an RNA and/or a DNA presumed to belong to or
originate from the said pathological and/or infective
agent, and/or their complementary RNA and/or DNA, is/are
brought into contact with such a composition.

The present invention also relates to any
process for detecting the presence of or exposure to such a
pathological and/or infective agent, in a biological
sample, by bringing this sample into contact with a
peptide, in particular an antigenic peptide such as is
defined above, or an anti-ligand, in particular an
antibody to this peptide, such as is defined above.

In practice, and for example, a device for
detection of the MSRV-1 virus comprises a reagent such as
is defined above, supported by a solid support which is
immunologically compatible with the reagent, and a means
for bringing the biological sample, for example a sample
of blood or of cerebrospinal fluid, likely to contain
anti-MSRV-1 antibodies, into contact with this reagent
under conditions permitting a possible immunological
reaction, the foregoing items being accompanied by means
for detecting the immune complex formed with this reagent.

Lastly, the invention also relates to the detection
of anti-MSRV-1 antibodies in a biological sample, for
example a sample of blood or of cerebrospinal fluid,
according to which this sample is brought into contact
with a reagent such as is defined above, consisting of an
antibody, under conditions permitting their possible
immunological reaction, and the presence of the immune
complex thereby formed with the reagent is then detected.

Before describing the invention in detail, 
different terms used in the description and the claims are 
now defined: 

- strain or isolate is understood to mean any
infective and/or pathogenic biological fraction containing,
for example, viruses and/or bacteria and/or parasites,
generating pathogenic and/or antigenic power,
harboured by a culture or a living host; as an example, a
viral strain according to the above definition can contain
a coinfective agent, for example a pathogenic protist,

- the term "MSRV" used in the present
description denotes any pathogenic and/or infective agent
associated with MS, in particular a viral species, the
attenuated strains of the said viral species or the
defective-interfering particles or particles containing
coencapsidated genomes, or alternatively genomes
recombined with a portion of the MSRV-1 genome, derived
from this species. Viruses, and especially viruses
containing RNA, are known to have a variability resulting,
in particular, from relatively high rates of spontaneous
mutation (7), which will be borne in mind below for
defining the notion of equivalence,

- human virus is understood to mean a virus
capable of infecting, or of being harboured by, human
beings,

- in view of all the natural or induced variations
and/or recombination which may be encountered when
implementing the present invention, the subjects of the
latter, defined above and in the claims, have been
expressed including the equivalents or derivatives of the
different biological materials defined below, in
particular of the homologous nucleotide or peptide
sequences,

- the variant of a virus or of a pathogenic
and/or infective agent according to the invention
comprises at least one antigen recognized by at least one
antibody directed against at least one corresponding
antigen of the said virus and/or said pathogenic and/or
infective agent, and/or a genome any part of which is
detected by at least one hybridization probe and/or at
least one nucleotide amplification primer specific for the
said virus and/or pathogenic and/or infective agent, such
as, for example, for the MSRV-1 virus, the primers and
probes having a nucleotide sequence chosen from
SEQ ID NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16
to SEQ ID NO:19, SEQ ID NO:31 to SEQ ID NO:33,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:45 and their complementary
sequences, under particular hybridization conditions well
known to a person skilled in the art,

- according to the invention, a nucleotide
fragment or an oligonucleotide or polynucleotide is an
arrangement of monomers, or a biopolymer, characterized by
the informational sequence of the natural nucleic acids,
which is capable of hybridizing with any other nucleotide
fragment under predetermined conditions, it being possible
for the arrangement to contain monomers of different
chemical structures and to be obtained from a molecule of
natural nucleic acid and/or by genetic recombination
and/or by chemical synthesis; a nucleotide fragment may be
identical to a genomic fragment of the MSRV-1 virus
discussed in the present invention, in particular a gene
of this virus, for example pol or env in the case of the
said virus,

- thus, a monomer can be a natural nucleotide of
nucleic acid whose constituent elements are a sugar, a
phosphate group and a nitrogenous base; in RNA the sugar
is ribose, in DNA the sugar is 2-deoxyribose; depending on
whether the nucleic acid is DNA or RNA, the nitrogenous
base is chosen from adenine, guanine, uracil, cytosine and
thymine; or the nucleotide can be modified in at least one
of the three constituent elements; as an example, the
modification can occur in the bases, generating modified
bases such as inosine, 5-methyldeoxycytidine,
deoxyuridine, 5-(dimethylamino)deoxyuridine,
2,6-diaminopurine, 5-bromodeoxyuridine and any other modified
base promoting hybridization; in the sugar, the
modification can consist of the replacement of at least
one deoxyribose by a polyamide (8), and in the phosphate
group, the modification can consist of its replacement by
esters chosen, in particular, from diphosphate, alkyl- and
arylphosphonate and phosphorothioate esters,

- "informational sequence" is understood to mean
any ordered succession of monomers whose chemical nature
and order in a reference direction constitute, or otherwise,
an item of functional information of the same quality as
that of the natural nucleic acids,

- hybridization is understood to mean the
process during which, under suitable working conditions,
two nucleotide fragments having sufficiently complementary
sequences pair to form a complex structure, in particular
double or triple, preferably in the form of a helix,

- a probe comprises a nucleotide fragment synthesized
chemically or obtained by digestion or enzymatic
cleavage of a longer nucleotide fragment, comprising at
least six monomers, advantageously from 10 to 1000 monomers,
preferably 10 to 30 monomers and more preferably 18
to 30, and possessing a specificity of hybridization under
particular conditions; preferably, a probe possessing
fewer than 10 monomers, but preferably fewer than 15
monomers, is not used alone, but is used in the presence of
other probes of equally short size or otherwise; under
certain special conditions, it may be useful to use probes
of size greater than 100 monomers; a probe may be used, in
particular, for diagnostic purposes, such molecules being,
for example, capture and/or detection probes,

- the capture probe may be immobilized on a
solid support by any suitable means, that is to say
directly or indirectly, for example by covalent bonding or
passive adsorption,

- the detection probe may be labelled by means
of a label chosen, in particular, from radioactive
isotopes, enzymes chosen, in particular, from peroxidase
and alkaline phosphatase and those capable of hydrolysing
a chromogenic, fluorogenic or luminescent substrate,
chromophoric chemical compounds, chromogenic, fluorogenic
or luminescent compounds, nucleotide base analogues and
biotin,

- the probes used for diagnostic purposes of the
invention may be employed in all known hybridization
techniques, and in particular the techniques termed
"DOT-BLOT" (9), "SOUTHERN BLOT" (10), "NORTHERN BLOT", which is
a technique identical to the "SOUTHERN BLOT" technique but
which uses RNA as target, and the SANDWICH technique (11);
advantageously, the SANDWICH technique is used in the
present invention, comprising a specific capture probe
and/or a specific detection probe, on the understanding
that the capture probe and the detection probe must
possess an at least partially different nucleotide
sequence,

- any probe according to the present invention
can hybridize in vivo or in vitro with RNA and/or with DNA
in order to block the phenomena of replication, in
particular translation and/or transcription, and/or to
degrade the said DNA and/or RNA,

- a primer is a probe comprising at least six
monomers, and advantageously from 10 to 30 monomers, and
preferably from 18 to 25 monomers, possessing a
specificity of hybridization under particular conditions
for the initiation of an enzymatic polymerization, for
example in an amplification technique such as PCR
(polymerase chain reaction), in an elongation process such
as sequencing, in a method of reverse transcription or the
like,

- two nucleotide or peptide sequences are termed 
equivalent or derived with respect to one another, or with 
respect to a reference sequence, if functionally the 
corresponding biopolymers can perform substantially the 
same role, without being identical, as regards the 
application or use in question, or in the technique in 
which they participate; two sequences are, in particular, 
equivalent if they are obtained as a result of natural 
variability, in particular spontaneous mutation of the 
species from which they have been identified, or induced 
variability, as are two homologous sequences, homology 
being defined below,

- "variability" is understood to mean any
spontaneous or induced modification of a sequence, in particular
by substitution and/or insertion and/or deletion
of nucleotides and/or of nucleotide fragments, and/or
extension and/or shortening of the sequence at one or both
ends; an unnatural variability can result from the genetic
engineering techniques used, for example the choice of 
synthesis primers, degenerate or otherwise, selected for 
amplifying a nucleic acid; this variability can manifest 
itself in modifications of any starting sequence, 
considered as reference, and capable of being expressed by 
a degree of homology relative to the said reference 
sequence, 

- homology characterizes the degree of identity 
of two nucleotide or peptide fragments compared; it is 
measured by the percentage identity which is determined, 
in particular, by direct comparison of nucleotide or 
peptide sequences, relative to reference nucleotide or 
peptide sequences,

- this percentage identity has been specifically
determined for the nucleotide fragments, clones in
particular, dealt with in the present invention, which are
homologous to the fragments identified, for the MSRV-1
virus, by SEQ ID NO:1 to NO:9, SEQ ID NO:46, SEQ ID NO:51
to SEQ ID NO:53, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57
and SEQ ID NO:93, as well as for the probes and primers
homologous to the probes and primers identified by SEQ ID
NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16 to SEQ
ID NO:19, SEQ ID NO:31 to SEQ ID NO:33, SEQ ID NO:45, SEQ
ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID
NO:55, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57 and SEQ ID
NO:99 to SEQ ID NO:111; as an example, the smallest
percentage identity observed between the different general
consensus sequences of nucleic acids obtained from
fragments of MSRV-1 viral RNA, originating from the LM7PC
and PLI-2 lines according to a protocol detailed later, is
67% in the region described in Figure 1,

- any nucleotide fragment is termed equivalent
or derived from a reference fragment if it possesses a
nucleotide sequence equivalent to the sequence of the
reference fragment; according to the above definition, the
following in particular are equivalent to a reference
nucleotide fragment:

a) any fragment capable of hybridizing at least
partially with the complement of the reference fragment,

b) any fragment whose alignment with the reference
fragment results in the demonstration of a larger
number of identical contiguous bases than with any other
fragment originating from another taxonomic group,

c) any fragment resulting, or capable of resulting,
from the natural variability of the species from
which it is obtained,

d) any fragment capable of resulting from the
genetic engineering techniques applied to the reference
fragment,

e) any fragment containing at least eight
contiguous nucleotides encoding a peptide which is
homologous or identical to the peptide encoded by the
reference fragment,

f) any fragment which is different from the
reference fragment by insertion, deletion or substitution
of at least one monomer, or extension or shortening at one
or both of its ends; for example, any fragment
corresponding to the reference fragment flanked at one or
both of its ends by a nucleotide sequence not coding for a
polypeptide,

- polypeptide is understood to mean, in particular,
any peptide of at least two amino acids, in particular
an oligopeptide, or protein, and for example an
enzyme, extracted, separated or substantially isolated or
synthesized through human intervention, in particular
those obtained by chemical synthesis or by expression in a
recombinant organism,

- polypeptide partially encoded by a nucleotide
fragment is understood to mean a polypeptide possessing at
least three amino acids encoded by at least nine
contiguous monomers lying within the said nucleotide
fragment,

- an amino acid is termed analogous to another
amino acid when their respective physicochemical properties,
such as polarity, hydrophobicity and/or basicity
and/or acidity and/or neutrality, are substantially the
same; thus, a leucine is analogous to an isoleucine,

- any polypeptide is termed equivalent or
derived from a reference polypeptide if the polypeptides
compared have substantially the same properties, and in
particular the same antigenic, immunological,
enzymological and/or molecular recognition properties; the
following in particular are equivalent to a reference
polypeptide:

a) any polypeptide possessing a sequence in
which at least one amino acid has been replaced by an
analogous amino acid,

b) any polypeptide having an equivalent peptide
sequence, obtained by natural or induced variation of the
said reference polypeptide and/or of the nucleotide
fragment coding for the said polypeptide,

c) a mimotope of the said reference polypeptide,

d) any polypeptide in whose sequence one or more
amino acids of the L series are replaced by an amino acid
of the D series, and vice versa,

e) any polypeptide into whose sequence a modification
of the side chains of the amino acids has been
introduced, such as, for example, an acetylation of the
amine functions, a carboxylation of the thiol functions,
an esterification of the carboxyl functions,

f) any polypeptide in whose sequence one or more
peptide bonds have been modified, such as, for example,
carba, retro, inverso, retro-inverso, reduced and
methyleneoxy bonds,

g) any polypeptide at least one antigen of
which is recognized by an antibody directed against a
reference polypeptide,

- the percentage identity characterizing the
homology of two peptide fragments compared is, according
to the present invention, at least 50% and preferably at
least 70%.
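
By way of illustration only, the percentage identity used above to
express homology may be sketched as follows. The function names, the
gap symbol and the toy sequences are assumptions introduced for this
example; the description does not prescribe a particular formula or
alignment method, and the two sequences are presumed to have been
aligned to the same length beforehand.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of compared positions carrying the same residue.

    Assumption: both sequences are already aligned (equal length).
    Positions where both sequences carry a gap are ignored; a gap
    facing a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # nothing to compare at this position
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def is_homologous_peptide(seq_a: str, seq_b: str, minimum: float = 50.0) -> bool:
    """True when two aligned peptides reach the minimum percentage identity.

    The default of 50% is the lower bound stated in the text; 70% is
    the preferred bound and can be passed as `minimum`.
    """
    return percent_identity(seq_a, seq_b) >= minimum


# Hypothetical toy peptides, not sequences of the invention.
print(percent_identity("MKVL", "MKIL"))
print(is_homologous_peptide("MKVL", "MKIL", minimum=70.0))
```

Other conventions exist, for example dividing by the length of the
shorter sequence; the choice made here is merely one simple option.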

In view of the fact that a virus possessing
reverse transcriptase enzymatic activity may be genetically
characterized equally well in RNA and in DNA form,
both the viral DNA and RNA will be referred to for
characterizing the sequences relating to a virus possessing
such reverse transcriptase activity, termed MSRV-1
according to the present description.
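
The interchangeability of the RNA and DNA forms of a sequence stated
above may be illustrated by the following minimal sketch. The helper
names and the toy sequence are assumptions made for the example and do
not appear in the description; only the four standard bases are
handled.

```python
# Translation table for the complement of the four standard DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence in its DNA form (uracil becomes thymine)."""
    return rna.upper().replace("U", "T")


def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence in its RNA form (thymine becomes uracil)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Complementary DNA strand, read 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def same_sequence(seq_a: str, seq_b: str) -> bool:
    """True when two sequences are identical once both are written as DNA."""
    return rna_to_dna(seq_a) == rna_to_dna(seq_b)


# Hypothetical toy sequence, not a sequence of the invention.
print(rna_to_dna("AUGGCU"))
print(reverse_complement("ATGGCT"))
```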

The expressions of order used in the present
description and the claims, such as "first nucleotide
sequence", are not adopted so as to express a particular
order, but so as to define the invention more clearly.

Detection of a substance or agent is understood
below to mean both an identification and a quantification,
or a separation or isolation, of the said substance or
said agent.

A better understanding of the invention will be 
gained on reading the detailed description which follows, 
prepared with reference to the attached figures, in which: 

- Figure 1 shows general consensus sequences of
nucleic acids of the MSRV-1B clones amplified by the PCR
technique in the "pol" region defined by Shih (12), from
viral DNA originating from the LM7PC and PLI-2 lines, and
identified under the references SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5 and SEQ ID NO:6, and the common consensus with
amplification primers bearing the reference SEQ ID NO:7;

- Figure 2 gives the definition of a functional
reading frame for each MSRV-1B/"PCR pol" type family, the
said families A to D being defined, respectively, by the
nucleotide sequences SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5
and SEQ ID NO:6 described in Figure 1;

- Figure 3 gives an example of consensus of the 
MSRV-2B sequences, identified by SEQ ID NO: 11; 

- Figure 4 is a representation of the reverse
transcriptase (RT) activity in dpm (disintegrations per
minute) in the sucrose fractions taken from a purification
gradient of the virions produced by the B lymphocytes in
culture from a patient suffering from MS;

- Figure 5 gives, under the same experimental
conditions as in Figure 4, the assay of the reverse
transcriptase activity in the culture of a B lymphocyte
line obtained from a control free from MS;

- Figure 6 shows the nucleotide sequence of the 
clone PSJ17 (SEQ ID NO:9); 

- Figure 7 shows the nucleotide sequence SEQ ID
NO:8 of the clone designated M003-P004;

- Figure 8 shows the nucleotide sequence SEQ ID
NO:2 of the clone F11-1; the portion located between the
two arrows in the region of the primer corresponds to a
variability imposed by the choice of primer which was used
for the cloning of F11-1; in this same figure, the
translation into amino acids is shown;

- Figure 9 shows the nucleotide sequence SEQ ID
NO:1, and a possible functional reading frame of SEQ ID
NO:1 in terms of amino acids; on this sequence, the
consensus sequences of the pol gene are underlined;

- Figures 10 and 11 give the results of a PCR,
in the form of a photograph under ultraviolet light of an
ethidium bromide-impregnated agarose gel, of the amplification
products obtained from the primers identified by
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18 and SEQ ID NO:19;

- Figure 12 gives a representation in matrix
form of the homology between SEQ ID NO:1 of MSRV-1 and
that of an endogenous retrovirus designated HSERV-9; this
homology of at least 65% is demonstrated by a continuous
line, the absence of a line meaning a homology of less
than 65%;

- Figure 13 shows the nucleotide sequence SEQ ID
NO:46 of the clone FBd3;

- Figure 14 shows the sequence homology between
the clone FBd3 and the HSERV-9 retrovirus;

- Figure 15 shows the nucleotide sequence SEQ ID 
NO: 51 of the clone t pol; 

- Figures 16 and 17 show, respectively, the
nucleotide sequences SEQ ID NO:52 and SEQ ID NO:53 of the
clones JLBc1 and JLBc2;

- Figure 18 shows the sequence homology between
the clone JLBc1 and the clone FBd3;

- and Figure 19 the sequence homology between
the clone JLBc2 and the clone FBd3;

- Figure 20 shows the sequence homology between
the clones JLBc1 and JLBc2;

- Figures 21 and 22 show the sequence homology
between the HSERV-9 retrovirus and the clones JLBc1 and
JLBc2, respectively;

- Figure 23 shows the nucleotide sequence SEQ ID
NO:56 of the clone GM3;

- Figure 24 shows the sequence homology between
the HSERV-9 retrovirus and the clone GM3;

- Figure 25 shows the localization of the 
different clones studied, relative to the genome of the 
known retrovirus ERV9; 

- Figure 26 shows the position of the clones
F11-1, M003-P004, MSRV-1B and PSJ17 in the region
hereinafter designated MSRV-1 pol*;

- Figure 27, split into three successive Figures 
27a-27c, shows a possible reading frame covering the whole 
of the pol gene; 

- Figure 28 shows, according to SEQ ID NO: 40, 
the nucleotide sequence coding for the peptide fragment 
POL2B, having the amino acid sequence identified by SEQ ID 
NO: 39; 

- Figure 29 shows the OD values (ELISA tests) at 
492 nm obtained for 29 sera of MS patients and 32 sera of 
healthy controls tested with an anti-IgG antibody; 

- Figure 30 shows the OD values (ELISA tests) at 
492 nm obtained for 36 sera of MS patients and 42 sera of 
healthy controls tested with an anti-IgM antibody; 

- Figures 31 to 33 show the results obtained
(relative intensity of the spots) for 43 overlapping
octapeptides covering the amino acid sequence 61-110,
according to the Spotscan technique, respectively with a
pool of MS sera, with a pool of control sera and with the
pool of MS sera after deduction of a background corresponding
to the maximum signal detected on at least one
octapeptide with the control serum (intensity = 1), on the
understanding that these sera were diluted to 1/50. The
bar at the far right-hand end represents a graphic scale
standard unrelated to the serological test;

- Figure 34 shows the SEQ ID NO:41 and SEQ ID
NO:42 of two polypeptides comprising immunodominant
regions, while SEQ ID NO:43 and 44 represent
immunoreactive polypeptides specific to MS;

- Figure 35 shows the nucleotide sequence SEQ ID 
NO: 59 of the clone LB19 and three potential reading frames 
of SEQ ID NO: 59 in terms of amino acids; 

- Figure 36 shows the nucleotide sequence SEQ ID
NO:88 (GAG*) and a potential reading frame of SEQ ID NO:88
in terms of amino acids;

- Figure 37 shows the sequence homology between
the clone FBd13 and the HSERV-9 retrovirus; according to
this representation, the continuous line means a
percentage homology greater than or equal to 70% and the
absence of a line means a smaller percentage homology;

- Figure 38 shows the nucleotide sequence SEQ ID
NO:61 of the clone FP6 and three potential reading frames
of SEQ ID NO:61 in terms of amino acids;

- Figure 39 shows the nucleotide sequence SEQ ID 
NO: 89 of the clone G+E+A and three potential reading 
frames of SEQ ID NO: 89 in terms of amino acids; 

- Figure 40 shows a reading frame found in the
region E and coding for an MSRV-1 retroviral protease
identified by SEQ ID NO:90;

- Figure 41 shows the response of each serum of
patients suffering from MS, indicated by the symbol (+),
and of healthy patients, symbolised by (-), tested with an
anti-IgG antibody, expressed as net optical density at
492 nm;

- Figure 42 shows the response of each serum of
patients suffering from MS, indicated by the symbols (+)
and (QS), and of healthy patients (-), tested with an
anti-IgM antibody, expressed as net optical density at
492 nm;

- Figure 43 shows the RT-activity profile in
sucrose density gradients of pellets from B-cell line
supernatants; control B-cell line ■ was obtained from the
relative of a patient with mitochondriopathy; MS B-cell
line □ was obtained from a patient with definite MS;

- Figure 44 shows the nucleotide and amino acid
alignment of the conserved pol regions of viruses detected
in the study (cf Example 18) by the "Pan-retrovirus" PCR.
"Deletions" are represented by dashes and standard single-letter
abbreviations are used to designate amino acids and
nucleotides (i = inosine). The most highly conserved VLPQG
and YXDD regions are shown as separate blocks in bold type
at the end of each sequence. Amino acids which are present
in all or in all but one of the sequences are underlined.
PCR primers (modified from (12)) PAN-UO and PAN-UI are
orientated 5' to 3' (sense) whereas primer PAN-DI is 3' to
5' (antisense). Degeneracies are shown above (PAN-UO &
PAN-DI) or below (PAN-UI) the PCR primer sequences.
"I" denotes the nine base 5' extension cttggatcc,
denotes the nine base 5' extension ctcaagctt. The capture
and detector probes DpV1 and CpV1b used in the ELOSA assay
are shown below a representative MSRV-cpol sequence. At
three positions below the translated MSRV-cpol sequence
alternative amino acids (representing "non-silent" nucleic
acid variations) are shown in italics - K and Y
substitutions were only observed in PLI-1 derived clones
whereas R and W were encoded by a significant proportion
of the clones irrespective of derivation. Note that DpV1
is peroxidase labelled and that CpV1b may be biotinylated
at the 5' end if streptavidin coated plates are used. The
name of each sequence is indicated at the left of the
figure.

HTLV1: Human Leukaemia Virus type 1; HIV1: Human
Immunodeficiency Virus type 1; MoMLV: Moloney-Murine
Leukaemia Virus; MPMV: Mason-Pfizer Monkey Virus. ERV9:
Endogenous Retrovirus 9. MSRV-cpol: Multiple Sclerosis
associated Retrovirus conserved pol region.

- Figure 45 shows a phylogenetic tree which is
based on the conserved amino acid region encoded by the
pol gene of MSRV and of representative endogenous and
exogenous retroviruses and DNA viruses with reverse
transcriptase. It was generated by the U.P.G.M.A. tree
program of Geneworks® software.

HSRV: Human Spumaretrovirus. EIAV: Equine Infectious
Anaemia Virus. BLV: Bovine Leukaemia Virus. HIV1, HIV2:
Human Immunodeficiency Viruses type 1 and 2. HTLV1 and
HTLV2: Human Leukaemia Viruses type 1 and 2. F-MuLV:
Friend-Murine Leukaemia Virus. MoMLV: Moloney-Murine
Leukaemia Virus. BAEV: Baboon Endogenous Virus. GaLV:
Gibbon Ape Leukaemia Virus. HUMER41: Human Endogenous
Retroviral sequence, clone 41. IAP: Intracisternal A-type
Particle. MPMV: Mason-Pfizer Monkey Virus. HERVK10: Human
Endogenous Retrovirus K10. MMTV: Mouse Mammary Tumour
Virus. HSERV9 (ERV9 database sequence): Human sequence of
Endogenous Retrovirus 9. MSRV: Multiple Sclerosis
associated Retrovirus. SIV: Simian Immunodeficiency Virus;
RTLV-H: Reverse Transcriptase-Like Viral sequence H; SFV:
Simian Foamy Virus; VISNA: Visna retrovirus; SIV1: Simian
Immunodeficiency Virus type 1; SRV-2: Simian Retrovirus
type 2; SMRV-H: Squirrel Monkey Retrovirus H.

- Figure 46 shows the MSRV sequence in the 
Protease and Reverse-Transcriptase regions of the pol 
gene. 

The amino acid translation is aligned under the
corresponding nucleotide sequence. The region
corresponding to the Protease ORF cloned in a recombinant
vector and expressed in E. coli is boxed. The regions
corresponding to the A and B fragments amplified on plasma
samples from MS patients are indicated by brackets. The
Reverse-Transcriptase (RT) and RNase H (RNH) region is
boxed with a dotted line. The highly conserved amino acids
and/or active sites of enzyme activities of both PRT and
RT (including RNH) are shown underlined.

- Figure 47A illustrates the specific detection
of MSRV-pol RNA sequence by RT-PCR in the sucrose density
fraction associated with RT-activity and in MS plasma;
Figure 47B shows the RT-activity profile on a sucrose
density gradient obtained with extracellular virion
pelleted from an MS choroid-plexus culture. The photograph
below shows an agarose gel loaded with PCR products
amplified from round 1 (ST1.1) RT-PCR products with the
ST1.2 primer set. From left to right: water control 1 from
RT-PCR step with ST1.1 set; water control 2 amplified from
water control 1 with ST1.2 nested primers; molecular
weight markers; fractions n° 1 to 10 corresponding to the
RT-activity profile shown above; plasma samples C1 and C2
from healthy blood donors; plasma samples MS1 and MS2 from
two MS patients.

- Figure 48 shows an example of a variant and/or
recombined sequence in the region of the pol gene defined
by homology with the overlapping regions described in
Figure 25, as GM3, MSRV-1 pol*, t pol and FBd3.

- Figure 49 shows the nucleotide (Figure 49A)
and amino acid (Figure 49B) alignments of the pol region
between clones 1, 5 and 8 of the same patient (Experiment
46-7).

- Figure 50 shows the nucleotide (Figure 50A)
and amino acid (Figure 50B) alignments of the pol region
between clones 41, 43 and 42 of the same patient
(Experiment 68-1).

- Figure 51 shows the nucleotide (Figure 51A)
and amino acid (Figure 51B) alignments of the pol region
between the consensus sequence (SEQ ID NO:176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 52 shows the nucleotide (Figure 52A)
and amino acid (Figure 52B) alignments of the pol region
between the consensus sequence (SEQ ID NO:169) of clones
41, 43 and 42 of the same patient (Experiment 68-1) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 53 shows the nucleotide (Figure 53A)
and amino acid (Figure 53B) alignments of the pol region
between the consensus sequence (SEQ ID NO:176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and the
consensus sequence (SEQ ID NO:169) of clones 41, 43 and
42 of the same patient (Experiment 68-1).

Table 5 (at the end of the description) shows
the sequences obtained by RT-PCR with degenerate pol
primers on sucrose density gradient fractions containing
the peak of RT-activity or its negative control (cf
Example 18); and

Table 6 (at the end of the description) shows
the clinical data and results of MSRV-cpol detection by
"Pan-retro" PCR with specific ELOSA assay, on CSF from MS
and control patients (cf Example 18).
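
The matrix representations of homology referred to for Figures 12 and
37, in which a line is drawn only where the homology reaches a stated
threshold (65% or 70%), may be approximated by the following sketch.
It is an illustration under assumptions: the window length, the
function names and the toy sequences are hypothetical, and the
software actually used to draw the figures may proceed differently.

```python
def window_identity(win_a: str, win_b: str) -> float:
    """Percentage of identical positions between two equal-length windows."""
    matches = sum(x == y for x, y in zip(win_a, win_b))
    return 100.0 * matches / len(win_a)


def dot_matrix(seq_x: str, seq_y: str, window: int = 10,
               threshold: float = 65.0) -> list:
    """Return the (i, j) start positions whose windows reach the threshold.

    Each returned pair is a point of the matrix; consecutive points on a
    diagonal appear as a continuous line, and regions below the
    threshold leave the matrix empty.
    """
    hits = []
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            if window_identity(seq_x[i:i + window],
                               seq_y[j:j + window]) >= threshold:
                hits.append((i, j))
    return hits


# Hypothetical toy sequences, not sequences of the invention.
points = dot_matrix("ACGTACGT", "ACGTACGT", window=4, threshold=100.0)
print(points)
```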

EXAMPLE 1: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING, RESPECTIVELY, A RETROVIRUS MSRV-1
AND A COINFECTIVE AGENT MSRV-2, BY "NESTED" PCR AMPLIFICATION
OF THE CONSERVED POL REGIONS OF RETROVIRUSES ON
VIRION PREPARATIONS ORIGINATING FROM THE LM7PC AND PLI-2
LINES

A PCR technique derived from the technique
published by Shih (12) was used. This technique enables
all trace of contaminant DNA to be removed by treating all
the components of the reaction medium with DNase. It
concomitantly makes it possible, by the use of different
but overlapping primers in two successive series of PCR
amplification cycles, to increase the chances of amplifying
a cDNA synthesized from an amount of RNA which is
small at the outset and further reduced in the sample by
the spurious action of the DNase on the RNA. In effect,
the DNase is used under conditions of activity in excess
which enable all trace of contaminant DNA to be removed
before inactivation of this enzyme remaining in the sample
by heating to 85°C for 10 minutes. This variant of the PCR
technique described by Shih (12) was used on a cDNA
synthesized from the nucleic acids of fractions of
infective particles purified on a sucrose gradient
according to the technique described by H. Perron (13)
from the "POL-2" isolate (ECACC No. V92072202) produced by
the PLI-2 line (ECACC No. 92072201) on the one hand, and
from the MS7PG isolate (ECACC No. V93010816) produced by
the LM7PC line (ECACC No. 93010817) on the other hand.
These cultures were obtained according to the methods
which formed the subject of the patent applications
published under Nos WO 93/20188 and WO 93/20189.

After cloning the products amplified by this
technique with the TA Cloning Kit® and analysis of the
sequence using an Applied Biosystems model 373A Automatic
Sequencer, the sequences were analysed using the
Geneworks® software on the latest available version of the
Genebank® data bank.

The sequences cloned and sequenced from these
samples correspond, in particular, to two types of
sequence: a first type of sequence, to be found in the
majority of the clones (55% of the clones originating from
the POL-2 isolates of the PLI-2 culture, and 67% of the
clones originating from the MS7PG isolates of the LM7PC
cultures), which corresponds to a family of "pol"
sequences closely similar to, but different from, the
endogenous human retrovirus designated ERV-9 or HSERV-9,
and a second type of sequence which corresponds to
sequences very strongly homologous to a sequence
attributed to another infective and/or pathogenic agent
designated MSRV-2.

The first type of sequence, representing the
majority of the clones, consists of sequences whose
variability enables four subfamilies of sequences to be
defined. These subfamilies are sufficiently similar to one
another for it to be possible to consider them to be
quasi-species originating from the same retrovirus, as is
well known for the HIV-1 retrovirus (14), or to be the
outcome of interference with several endogenous proviruses
coregulated in the producing cells. These more or less
defective endogenous elements are sensitive to the same
regulatory signals possibly generated by a replicative
provirus, since they belong to the same family of
endogenous retroviruses (15). This new family of
endogenous retroviruses, or alternatively this new
retroviral species from which the generation of
quasi-species has been obtained in culture, and which
contains a consensus of the sequences described below, is
designated MSRV-1B.

Figure 1 presents the general consensus
sequences of the sequences of the different MSRV-1B clones
sequenced in this experiment, these sequences being
identified, respectively, by SEQ ID NO: 3, SEQ ID NO: 4,
SEQ ID NO: 5 and SEQ ID NO: 6. These sequences display a
homology with respect to nucleic acids ranging from 70% to
88% with the HSERV-9 sequence referenced X57147 and M37638
in the Genebank® data base. Four "consensus" nucleic acid
sequences representative of different quasi-species of a
possibly exogenous retrovirus MSRV-1B, or of different
subfamilies of an endogenous retrovirus MSRV-1B, have been
defined. These representative consensus sequences are
presented in Figure 2, with the translation into amino
acids. A functional reading frame exists for each
subfamily of these MSRV-1B sequences, and it can be seen
that the functional open reading frame corresponds in each
instance to the amino acid sequence appearing on the
second line under the nucleic acid sequence. The general
consensus of the MSRV-1B sequence, identified by SEQ ID
NO: 7 and obtained by this PCR technique in the "pol"
region, is presented in Figure 1.

The second type of sequence representing the
majority of the clones sequenced is represented by the
sequence MSRV-2B presented in Figure 3 and identified by
SEQ ID NO: 11. The differences observed in the sequences
corresponding to the PCR primers are explained by the use
of degenerate primers in mixture form used under different
technical conditions.

The MSRV-2B sequence (SEQ ID NO: 11) is sufficiently
divergent from the retroviral sequences already
described in the data banks for it to be suggested that
the sequence region in question belongs to a new infective
agent, designated MSRV-2. This infective agent would, in
principle, on the basis of the analysis of the first
sequences obtained, be related to a retrovirus but, in
view of the technique used for obtaining this sequence, it
could also be a DNA virus whose genome codes for an enzyme
which incidentally possesses reverse transcriptase
activity, as is the case, for example, with the hepatitis
B virus, HBV (12). Furthermore, the random nature of the
degenerate primers used for this PCR amplification
technique may very well have permitted, as a result of
unforeseen sequence homologies or of conserved sites in
the gene for a related enzyme, the amplification of a
nucleic acid originating from a prokaryotic or eukaryotic
pathogenic and/or coinfective agent (protist).

EXAMPLE 2: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING A FAMILY MSRV-1 AND MSRV-2, BY
"NESTED" PCR AMPLIFICATION OF THE CONSERVED POL REGIONS OF
RETROVIRUSES ON PREPARATIONS OF B LYMPHOCYTES FROM A NEW
CASE OF MS

The same PCR technique, modified according to
the technique of Shih (12), was used to amplify and
sequence the RNA nucleic acid material present in a
purified fraction of virions at the peak of "LM7-like"
reverse transcriptase activity on a sucrose gradient
according to the technique described by H. Perron (13),
and according to the protocols mentioned in Example 1,
from a spontaneous lymphoblastoid line obtained by
self-immortalization in culture of B lymphocytes from an MS
patient who was seropositive for the Epstein-Barr virus
(EBV), after setting up the blood lymphoid cells in
culture in a suitable culture medium containing a suitable
concentration of cyclosporin A. A representation of the
reverse transcriptase activity in the sucrose fractions
taken from a purification gradient of the virions produced
by this line is presented in Figure 4. Similarly, the
culture supernatants of a B line obtained under the same
conditions from a control free from MS were treated under
the same conditions, and the assay of reverse
transcriptase activity in the sucrose gradient fractions
proved negative throughout (background), and is presented
in Figure 5. Fraction 3 of the gradient corresponding to
the MS B line and the same fraction without reverse
transcriptase activity of the non-MS control gradient were
analysed by the same RT-PCR technique as before, derived
from Shih (12), followed by the same steps of cloning and
sequencing as described in Example 1.

It is particularly noteworthy that the MSRV-1
and MSRV-2 type sequences are to be found only in the
material associated with a peak of "LM7-like" reverse
transcriptase activity originating from the MS B
lymphoblastoid line. These sequences were not to be found with
the material from the control (non-MS) B lymphoblastoid
line in 26 recombinant clones taken at random. Only
Mo-MuLV type contaminant sequences, originating from the
commercial reverse transcriptase used for the cDNA
synthesis step, and sequences without any particular
retroviral analogy were to be found in this control, as a
result of the "consensus" amplification of homologous
polymerase sequences which is produced by this PCR
technique. Furthermore, the absence of a concentrated
target which competes for the amplification reaction in
the control sample permits the amplification of dilute
contaminants. The difference in results is manifestly
highly significant (chi-squared, p<0.001).

EXAMPLE 3: OBTAINING A CLONE PSJ17, DEFINING A
RETROVIRUS MSRV-1, BY REACTION OF ENDOGENOUS REVERSE
TRANSCRIPTASE WITH A VIRION PREPARATION ORIGINATING FROM
THE PLI-2 LINE

This approach is directed towards obtaining
reverse-transcribed DNA sequences from the supposedly
retroviral RNA in the isolate using the reverse
transcriptase activity present in this same isolate. This
reverse transcriptase activity can theoretically function
only in the presence of a retroviral RNA linked to a
primer tRNA or hybridized with short strands of DNA
already reverse-transcribed in the retroviral particles
(16). Thus, the obtaining of specific retroviral sequences
in a material contaminated with cellular nucleic acids was
optimized according to these authors by means of the
specific enzymatic amplification of the portions of viral
RNAs with a viral reverse transcriptase activity. To this
end, the authors determined the particular physicochemical
conditions under which this enzymatic activity of reverse
transcription on RNAs contained in virions could be
effective in vitro. These conditions correspond to the
technical description of the protocols presented below
(endogenous RT reaction, purification, cloning and
sequencing).

The molecular approach consisted in using a
preparation of concentrated but unpurified virion obtained
from the culture supernatants of the PLI-2 line, prepared
according to the following method: the culture
supernatants are collected twice weekly, precentrifuged at
10,000 rpm for 30 minutes to remove cell debris and then
frozen at -80°C or used as they are for the following
steps. The fresh or thawed supernatants are centrifuged on
a cushion of 30% glycerol-PBS at 100,000 g (or 30,000 rpm
in a type 45 T LKB-HITACHI rotor) for 2 h at 4°C. After
removal of the supernatant, the sedimented pellet is taken
up in a small volume of PBS and constitutes the fraction
of concentrated but unpurified virion. This concentrated
but unpurified viral sample was used to perform a
so-called endogenous reverse transcription reaction, as
described below.
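
The correspondence quoted above between 100,000 g and 30,000 rpm depends on the rotor radius, which the text does not state. As a rough cross-check only, the standard relation RCF = 1.118 x 10^-5 x r (cm) x rpm^2 can be inverted to see which radius the two figures imply. The function names below are illustrative assumptions, and the result is an implied value, not a rotor specification.

```python
# Standard relation between relative centrifugal force (multiples of g),
# rotor radius (cm) and rotation speed (rpm).
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (in g) at a given radius and speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rcf: float, rpm: float) -> float:
    """Radius (cm) at which the given speed would produce the given RCF."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    # Figures quoted in the text: 100,000 g at 30,000 rpm.
    radius = implied_radius_cm(100_000, 30_000)
    print(f"implied radius: {radius:.2f} cm")
    print(f"round trip: {rcf_from_rpm(30_000, radius):.0f} g")
```

The same two functions can be applied to any other rpm/g pair quoted for a different rotor; a markedly different implied radius simply reflects a different rotor geometry.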

A volume of 200 µl of virion purified according
to the protocol described above, and containing a reverse
transcriptase activity of approximately 1-5 million dpm,
is thawed at 37°C until a liquid phase appears, and then
placed on ice. A 5-fold concentrated buffer was prepared
with the following components: 500 mM Tris-HCl pH 8.2;
75 mM NaCl; 25 mM MgCl2; 75 mM DTT and 0.10% NP 40. 100 µl
of 5X buffer + 25 µl of a 100 mM solution of dATP + 25 µl
of a 100 mM solution of dTTP + 25 µl of a 100 mM solution
of dGTP + 25 µl of a 100 mM solution of dCTP + 100 µl of
sterile distilled water + 200 µl of the virion suspension
(RT activity of 5 million dpm) in PBS were mixed and
incubated at 42°C for 3 hours. After this incubation, the
reaction mixture is added directly to a buffered
phenol/chloroform/isoamyl alcohol mixture (Sigma ref.
P 3803); the aqueous phase is collected and one volume of
sterile distilled water is added to the organic phase to
re-extract the residual nucleic acid material. The
collected aqueous phases are combined, and the nucleic
acids contained are precipitated by adding 3 M sodium
acetate pH 5.2 to 1/10 volume + 2 volumes of ethanol +
1 µl of glycogen (Boehringer-Mannheim ref. 901 393) and
placing the sample at -20°C for 4 h or overnight at +4°C.
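
The final composition of the endogenous RT reaction mixture described above is not stated explicitly, but it follows from the listed volumes. The sketch below works it out; only the ratios of the volumes matter, so the volume unit is left implicit. The variable and function names are illustrative assumptions, and salts contributed by the PBS of the virion suspension are deliberately ignored.

```python
import math

# Volumes of each component as listed in the text (same unit throughout).
VOLUMES = {
    "5X buffer": 100,
    "dATP 100 mM": 25,
    "dTTP 100 mM": 25,
    "dGTP 100 mM": 25,
    "dCTP 100 mM": 25,
    "sterile distilled water": 100,
    "virion suspension": 200,
}

# Composition of the 5-fold concentrated buffer (mM, except NP 40 in %).
BUFFER_5X = {
    "Tris-HCl": 500.0,
    "NaCl": 75.0,
    "MgCl2": 25.0,
    "DTT": 75.0,
    "NP 40 (%)": 0.10,
}

DNTP_STOCK_MM = 100.0


def total_volume(volumes):
    """Sum of all component volumes."""
    return sum(volumes.values())


def final_concentration(stock, volume, total):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume / total


def final_mix(volumes=VOLUMES):
    """Final concentration of each buffer component and each dNTP."""
    total = total_volume(volumes)
    mix = {
        name: final_concentration(conc, volumes["5X buffer"], total)
        for name, conc in BUFFER_5X.items()
    }
    for base in ("dATP", "dTTP", "dGTP", "dCTP"):
        mix[base] = final_concentration(
            DNTP_STOCK_MM, volumes[f"{base} 100 mM"], total
        )
    return mix


if __name__ == "__main__":
    for name, value in final_mix().items():
        print(f"{name}: {value:g}")
```

Under these assumptions the 5X buffer ends up at 1X in the reaction, and each dNTP is diluted twenty-fold from its stock.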
The precipitate obtained after centrifugation is then
washed with 70% ethanol and resuspended in 60 µl of
distilled water. The products of this reaction were then
purified, cloned and sequenced according to the protocol
which will now be described.

Blunt-ended DNAs with unpaired adenines at the ends
were generated. A "filling-in" reaction was first
performed: 25 µl of the previously purified DNA solution
were mixed with 2 µl of a 2.5 mM solution containing, in
equimolar amounts, dATP + dGTP + dTTP + dCTP / 1 µl of
T4 DNA polymerase (Boehringer-Mannheim ref. 1004 786) /
5 µl of 10X "incubation buffer for restriction enzyme"
(Boehringer-Mannheim ref. 1417 975) / 1 µl of a 1% bovine
serum albumin solution / 16 µl of sterile distilled water.
This mixture was incubated for 20 minutes at 11°C. 50 µl
of TE buffer and 1 µl of glycogen (Boehringer-Mannheim
ref. 901 393) were added thereto before extraction of the
nucleic acids with phenol/chloroform/isoamyl alcohol
(Sigma ref. P 3803) and precipitation with sodium acetate
as described above. The DNA precipitated after
centrifugation is resuspended in 10 µl of 10 mM Tris
buffer pH 7.5. 5 µl of this suspension were then mixed
with 20 µl of 5X Taq buffer, 20 µl of 5 mM dATP, 1 µl (5U)
of Taq DNA polymerase (Amplitaq™) and 54 µl of sterile
distilled water. This mixture is incubated for 2 h at 75°C
with a film of oil on the surface of the solution. The DNA
suspended in the aqueous solution drawn off under the film
of oil after incubation is precipitated as described above
and resuspended in 2 µl of sterile distilled water.

The DNA obtained was inserted into a plasmid using
the TA Cloning™ kit. The 2 µl of DNA solution were mixed
with 5 µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out in order
to be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit.

The reaction prior to sequencing was then performed
according to the method recommended for the use of the
sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "Automatic Sequencer, model
373 A" apparatus according to the manufacturer's
instructions.

Discriminating analysis on the computerized data
banks of the sequences cloned from the DNA fragments
present in the reaction mixture enabled a retroviral type
sequence to be revealed. The corresponding clone PSJ17 was
completely sequenced, and the sequence obtained, presented
in Figure 6 and identified by SEQ ID NO: 9, was analysed
using the "Geneworks®" software on the updated "Genebank™"
data banks. An identical sequence already described could
not be found by analysis of the data banks. Only a partial
homology with some known retroviral elements was to be
found. The most useful relative homology relates to an
endogenous retrovirus designated ERV-9, or HSERV-9,
according to the references (18).




EXAMPLE 4: PCR AMPLIFICATION OF THE NUCLEIC ACID
SEQUENCE CONTAINED BETWEEN THE 5' REGION DEFINED BY THE
CLONE "POL MSRV-1B" AND THE 3' REGION DEFINED BY THE CLONE
PSJ17

Five oligonucleotides, M001, M002-A, M003-BCD,
P004 and P005, were defined in order to amplify the RNA
originating from purified POL-2 virions. Control reactions
were performed so as to check for the presence of
contaminants (reaction with water). The amplification
consists of an RT-PCR step according to the protocol
described in Example 2, followed by a "nested" PCR
according to the PCR protocol described in the document
EP-A-0,569,272. In the first RT-PCR cycle, the primers
M001 and P004 or P005 are used. In the second PCR cycle,
the primers M002-A or M003-BCD and the primer P004 are
used. The primers are positioned as follows:
              M002-A
              M003-BCD
M001                          P004    P005
------------------------------------------------  RNA POL-2
<------------------>      <------------------>
    pol MSRV-1B                  PSJ17

Their composition is: 
primer M001: GGTCITICCICAIGG (SEQ ID NO: 20) 
primer M002-A: TTAGGGATAGCCCTCATCTCT (SEQ ID NO: 21) 
primer M003-BCD: TCAGGGATAGCCCCCATCTAT (SEQ ID NO: 22) 
primer P004: AACCCTTTGCCACTACATCAATTT (SEQ ID NO: 23) 
primer P005: GCGTAAGGACTCCTAGAGCTATT (SEQ ID NO: 24)

The "nested" amplification product obtained, and 
designated M003-P004, is presented in Figure 7, and 
corresponds to the sequence SEQ ID NO: 8.

EXAMPLE 5: AMPLIFICATION AND CLONING OF A
PORTION OF THE MSRV-1 RETROVIRAL GENOME USING A SEQUENCE
ALREADY IDENTIFIED, IN A SAMPLE OF VIRUS PURIFIED AT THE
PEAK OF REVERSE TRANSCRIPTASE ACTIVITY

A PCR technique derived from the technique
published by Frohman (19) was used. The technique derived
makes it possible, using a specific primer at the 3' end
of the genome to be amplified, to elongate the sequence
towards the 5' region of the genome to be analysed. This
technical variant is described in the documentation of the
firm "Clontech Laboratories Inc." (Palo Alto, California,
USA) supplied with its product "5'-AmpliFINDER™ RACE
Kit", which was used on a fraction of virion purified as
described above.

The specific 3' primers used in the kit protocol
for the synthesis of the cDNA and the PCR amplification
are, respectively, complementary to the following MSRV-1
sequences:

cDNA: TCATCCATGTACCGAAGG (SEQ ID NO: 25)
amplification: ATGGGGTTCCCAAGTTCCCT (SEQ ID NO: 26)

The products originating from the PCR were
obtained after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit
(British Biotechnology). The 2 µl of DNA solution were
mixed with 5 µl of sterile distilled water, 1 µl of a
10-fold concentrated ligation buffer "10X LIGATION BUFFER",
2 µl of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA
LIGASE". This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "Automatic Sequencer, model
373 A" apparatus according to the manufacturer's
instructions.

This technique was applied first to two
fractions of virion purified as described below on sucrose
from the "POL-2" isolate produced by the PLI-2 line on the
one hand, and from the MS7PG isolate produced by the LM7PC
line on the other hand. The culture supernatants are
collected twice weekly, precentrifuged at 10,000 rpm for
30 minutes to remove cell debris and then frozen at -80°C
or used as they are for the following steps. The fresh or
thawed supernatants are centrifuged on a cushion of 30%
glycerol-PBS at 100,000 g (or 30,000 rpm in a type 45 T
LKB-HITACHI rotor) for 2 h at 4°C. After removal of the
supernatant, the sedimented pellet is taken up in a small
volume of PBS and constitutes the fraction of concentrated
but unpurified virions. The concentrated virus is then
applied to a sucrose gradient in sterile PBS buffer (15 to
50% weight/weight) and ultracentrifuged at 35,000 rpm
(100,000 g) for 12 h at +4°C in a swing-out rotor.
10 fractions are collected, and 20 µl are withdrawn from
each fraction after homogenization to assay the reverse
transcriptase activity therein according to the technique
described by H. Perron (3). The fractions containing the
peak of "LM7-like" RT activity are then diluted in sterile
PBS buffer and ultracentrifuged for one hour at 35,000 rpm
(100,000 g) to sediment the viral particles. The pellet of
purified virion thereby obtained is then taken up in a
small volume of a buffer which is appropriate for the
extraction of RNA. The cDNA synthesis reaction mentioned
above is carried out on this RNA extracted from purified
extracellular virion. PCR amplification according to the
technique mentioned above enabled the clone F1-11 to be
obtained, whose sequence, identified by SEQ ID NO: 2, is
presented in Figure 8.

This clone makes it possible to define, with the
different clones previously sequenced, a region of
considerable length (1.2 kb) representative of the "pol"
gene of the MSRV-1 retrovirus, as presented in Figure 9.
This sequence, designated SEQ ID NO: 1, is reconstituted
from different clones overlapping one another at their
ends, correcting the artefacts associated with the primers
and with the amplification or cloning techniques which
would artificially interrupt the reading frame of the
whole. This sequence will be identified below under the
designation "MSRV-1 pol* region". Its degree of homology
with the HSERV-9 sequence is shown in Figure 12.

In Figure 9, the potential reading frame with
its translation into amino acids is presented below the
nucleic acid sequence.

EXAMPLE 6: DETECTION OF SPECIFIC MSRV-1 AND
MSRV-2 SEQUENCES IN DIFFERENT SAMPLES OF PLASMA
ORIGINATING FROM PATIENTS SUFFERING FROM MS OR FROM
CONTROLS

A PCR technique was used to detect the MSRV-1
and MSRV-2 genomes in plasmas obtained from blood
samples taken onto EDTA from patients suffering from MS
and from non-MS controls.

Extraction of the RNAs from plasma was performed
according to the technique described by P. Chomzynski
(20), after adding one volume of buffer containing
guanidinium thiocyanate to 1 ml of plasma stored frozen at
-80°C after collection.

For MSRV-2, the PCR was performed under the same
conditions and with the following primers:

- 5' primer, identified by SEQ ID NO: 14
5' GTAGTTCGATGTAGAAAGCG 3';

- 3' primer, identified by SEQ ID NO: 15
5' GCATCCGGCAACTGCACG 3'.

However, similar results were also obtained with
the following PCR primers in two successive amplifications
by "nested" PCR on samples of nucleic acids not treated
with DNase.

The primers used for this first step of
40 cycles with a hybridization temperature of 48°C are the
following:

- 5' primer, identified by SEQ ID NO: 27
5' GCCGATATCACCCGCCATGG 3', corresponding to a
5' MSRV-2 PCR primer, for a first PCR on samples from
patients,

- 3' primer, identified by SEQ ID NO: 28
5' GCATCCGGCAACTGCACG 3', corresponding to a 3'
MSRV-2 PCR primer, for a first PCR on samples from
patients.

After this step, 10 µl of the amplification
product are taken and used to carry out a second,
so-called "nested" PCR amplification with primers located
within the region already amplified. This second step
takes place over 35 cycles, with a primer hybridization
("annealing") temperature of 50°C. The reaction volume is
100 µl.

The primers used for this second step are the
following:

- 5' primer, identified by SEQ ID NO: 29
5' CGCGATGCTGGTTGGAGAGC 3', corresponding to a
5' MSRV-2 PCR primer, for a nested PCR on samples from
patients,

- 3' primer, identified by SEQ ID NO: 30
5' TCTCCACTCCGAATATTCCG 3', corresponding to a
3' MSRV-2 PCR primer, for a nested PCR on samples from
patients.

For MSRV-1, the amplification was performed in
two steps. Furthermore, the nucleic acid sample is treated
beforehand with DNase, and a control PCR without RT (AMV
reverse transcriptase) is performed on the two
amplification steps so as to verify that the RT-PCR
amplification comes exclusively from the MSRV-1 RNA. In
the event of a positive control without RT, the initial
aliquot sample of RNA is again treated with DNase and
amplified again.

The protocol for treatment with DNase lacking
RNase activity is as follows: the extracted RNA is
aliquoted in the presence of "RNAse inhibitor"
(Boehringer-Mannheim) in water treated with DEPC at a
final concentration of 1 µg in 10 µl; to these 10 µl, 1 µl
of "RNAse-free DNAse" (Boehringer-Mannheim) and 1.2 µl of
pH 5 buffer containing 0.1 M sodium acetate and 5 mM
MgSO4 are added; the mixture is incubated for 15 min at
20°C and brought to 95°C for 1.5 min in a "thermocycler".

The first MSRV-1 RT-PCR step is performed
according to a variant of the RNA amplification method as
described in Patent Application No. EP-A-0,569,272. In
particular, the cDNA synthesis step is performed at 42°C
for one hour; the PCR amplification takes place over
40 cycles, with a primer hybridization ("annealing")
temperature of 53°C. The reaction volume is 100 µl.

The primers used for this first step are the
following:

- 5' primer, identified by SEQ ID NO: 16
5' AGGAGTAAGGAAACCCAACGGAC 3';

- 3' primer, identified by SEQ ID NO: 17
5' TAAGAGTTGCACAAGTGCG 3'.

After this step, 10 µl of the amplification
product are taken and used to carry out a second,
so-called "nested" PCR amplification with primers located
within the region already amplified. This second step
takes place over 35 cycles, with a primer hybridization
("annealing") temperature of 53°C. The reaction volume is
100 µl.

The primers used for this second step are the
following:

- 5' primer, identified by SEQ ID NO: 18
5' TCAGGGATAGCCCCCATCTAT 3';

- 3' primer, identified by SEQ ID NO: 19
5' AACCCTTTGCCACTACATCAATTT 3'.
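
The two-step "nested" design described above can be illustrated in silico. The sketch below is only a toy model: the template is a hypothetical sequence assembled around the four MSRV-1 primers quoted in the text (it is not the MSRV-1 sequence), priming is modelled as exact string matching with no mismatches or annealing temperature, and it assumes that each 3' primer anneals to the sense strand, so that its reverse complement appears in the template. The function names are illustrative.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template: str, fwd: str, rev: str):
    """Product delimited by a 5' primer and a 3' primer, or None.

    Exact matching only: the 5' primer must occur in the template and the
    reverse complement of the 3' primer must occur downstream of it.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    site = revcomp(rev)
    end = template.find(site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(site)]


OUTER_FWD = "AGGAGTAAGGAAACCCAACGGAC"   # SEQ ID NO: 16
OUTER_REV = "TAAGAGTTGCACAAGTGCG"       # SEQ ID NO: 17
INNER_FWD = "TCAGGGATAGCCCCCATCTAT"     # SEQ ID NO: 18
INNER_REV = "AACCCTTTGCCACTACATCAATTT"  # SEQ ID NO: 19

# Hypothetical toy template: filler bases around the primer sites,
# with the inner sites placed between the outer sites.
TOY_TEMPLATE = (
    "ACGT" * 3
    + OUTER_FWD + "GATTACA"
    + INNER_FWD + "CATG" * 5 + revcomp(INNER_REV)
    + "TTGACC" + revcomp(OUTER_REV)
    + "ACGT" * 3
)

# First round uses the outer pair; the second round is run on its product.
first_round = amplicon(TOY_TEMPLATE, OUTER_FWD, OUTER_REV)
second_round = amplicon(first_round, INNER_FWD, INNER_REV)

if __name__ == "__main__":
    print(len(first_round), len(second_round))
```

Because the second-round primers only find their sites inside the first-round product, a template lacking either inner site yields no nested product, which is the source of the added specificity.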

Figures 10 and 11 present the results of PCR in
the form of photographs under ultraviolet light of
ethidium bromide-impregnated agarose gels, in which an
electrophoresis of the PCR amplification products applied
separately to the different wells was performed.

The top photograph (Figure 10) shows the result
of specific MSRV-2 amplification.

Well number 8 contains a mixture of DNA
molecular weight markers, and wells 1 to 7 represent, in
order, the products amplified from the total RNAs of
plasmas originating from 4 healthy controls free from MS
(wells 1 to 4) and from 3 patients suffering from MS at
different stages of the disease (wells 5 to 7).

In this series, MSRV-2 nucleic acid material is
detected in the plasma of one case of MS out of the 3
tested, and in none of the 4 control plasmas. Other
results obtained on more extensive series confirm these
results.

The bottom photograph (Figure 11) shows the
result of specific amplification by MSRV-1 "nested"
RT-PCR:

well No. 1 contains the PCR product produced
with water alone, without the addition of AMV reverse
transcriptase; well No. 2 contains the PCR product
produced with water alone, with the addition of AMV
reverse transcriptase; well No. 3 contains a mixture of
DNA molecular weight markers; wells 4 to 13 contain, in
order, the products amplified from the total RNAs
extracted from sucrose gradient fractions (collected in a
downward direction), on which gradient a pellet of virion
originating from a supernatant of a culture infected with
MSRV-1 and MSRV-2 was centrifuged to equilibrium according
to the protocol described by H. Perron (13); to well 14
nothing was applied; to wells 15 to 17, the amplified
products of RNA extracted from plasmas originating from 3
different patients suffering from MS at different stages
of the disease were applied.

The MSRV-1 retroviral genome is indeed to be
found in the sucrose gradient fraction containing the peak
of reverse transcriptase activity measured according to
the technique described by H. Perron (3), with a very
strong intensity (fraction 5 of the gradient, placed in
well No. 8). A slight amplification has taken place in the
first fraction (well No. 4), probably corresponding to RNA
released by lysed particles which floated at the surface
of the gradient; similarly, aggregated debris has
sedimented in the last fraction (tube bottom), carrying
with it a few copies of the MSRV-1 genome which have given
rise to an amplification of low intensity.

Of the 3 MS plasmas tested in this series,
MSRV-1 RNA was detected in one case, producing a very
intense amplification (well No. 17).

In this series, the MSRV-1 retroviral RNA
genome, probably corresponding to particles of
extracellular virus present in the plasma in extremely
small numbers, was detected by "nested" RT-PCR in one case
of MS out of the 3 tested. Other results obtained on more
extensive series confirm these results.

Furthermore, the specificity of the sequences
amplified by these PCR techniques may be verified and
evaluated by the "ELOSA" technique as described by
F. Mallet (21) and in the document FR-A-2,663,040.

For MSRV-1, the products of the nested PCR
described above may be tested in two ELOSA systems
enabling a consensus A and a consensus B+C+D of MSRV-1 to
be detected separately, corresponding to the subfamilies
described in Example 1 and Figures 1 and 2. In effect, the
sequences closely resembling the consensus B+C+D are to be
found essentially in the RNA samples originating from
MSRV-1 virions purified from cultures or amplified in
extracellular biological fluids of MS patients, whereas
the sequences closely resembling the consensus A are
essentially to be found in normal human cellular DNA.

The ELOSA/MSRV-1 system for the capture and
specific hybridization of the PCR products of the
subfamily A uses a capture oligonucleotide cpV1A with an
amine bond at the 5' end and a biotinylated detection
oligonucleotide dpV1A having as their sequence,
respectively:

- cpV1A identified by SEQ ID NO: 31
5' GATCTAGGCCACTTCTCAGGTCCAGS 3', corresponding
to the ELOSA capture oligonucleotide for the products of
MSRV-1 nested PCR performed with the primers identified by
SEQ ID NO: 16 and SEQ ID NO: 17, optionally followed by
amplification with the primers identified by SEQ ID NO: 18
and SEQ ID NO: 19 on samples from patients;

- dpV1A identified by SEQ ID NO: 32
5' CATCTITTTGGICAGGCAITAGC 3', corresponding to
the ELOSA detection oligonucleotide for the subfamily A of
the products of MSRV-1 "nested" PCR performed with the
primers identified by SEQ ID NO: 16 and SEQ ID NO: 17,
optionally followed by amplification with the primers
identified by SEQ ID NO: 18 and SEQ ID NO: 19 on samples
from patients.

The ELOSA/MSRV-1 system for the capture and
specific hybridization of the PCR products of the
subfamily B+C+D uses the same biotinylated detection
oligonucleotide dpV1A and a capture oligonucleotide cpV1B
with an amine bond at the 5' end having as its sequence:

- cpV1B identified by SEQ ID NO: 33
5' CTTGAGCCAGTTCTCATACCTGGA 3', corresponding to
the ELOSA capture oligonucleotide for the subfamily B+C+D
of the products of MSRV-1 "nested" PCR performed with
the primers identified by SEQ ID NO: 16 and SEQ ID NO: 17,
optionally followed by amplification with the primers
identified by SEQ ID NO: 18 and SEQ ID NO: 19 on samples
from patients.

This ELOSA detection system enabled it to be verified that
none of the PCR products thus amplified from DNase-treated
plasmas of MS patients contained a sequence of the subfamily
A, and that all were positive with the consensus of the
subfamilies B, C and D.

For MSRV-2, a similar ELOSA technique was evaluated on
isolates originating from infected cell cultures, using the
following PCR amplification primers:

- 5' primer, identified by SEQ ID NO: 34

5' AGTGYTRCCMCARGGCGCTGAA 3', corresponding to a 5' MSRV-2
PCR primer, for PCR on samples from cultures,

- 3' primer, identified by SEQ ID NO: 35

5' GMGGCCAGCAGSAKGTCATCCA 3', corresponding to a 3' MSRV-2
PCR primer, for PCR on samples from cultures,

and the capture oligonucleotide cpV2, with an amine bond at
the 5' end, and the biotinylated detection oligonucleotide
dpV2, having as their respective sequences:

- cpV2 identified by SEQ ID NO: 36

5' GGATGCCGCCTATAGCCTCTAC 3', corresponding to an ELOSA
capture oligonucleotide for the products of MSRV-2 PCR
performed with the primers SEQ ID NO: 34 and SEQ ID NO: 35,
or optionally with the degenerate primers defined by
Shih (12);

- dpV2 identified by SEQ ID NO: 37

5' AAGCCTATCGCGTGCAGTTGCC 3', corresponding to an ELOSA
detection oligonucleotide for the products of MSRV-2 PCR
performed with the primers SEQ ID NO: 34 and SEQ ID NO: 35,
or optionally with the degenerate primers defined by
Shih (12).

This PCR amplification system with a pair of primers
different from those which were described previously for
amplification on the samples from patients made it possible
to confirm the infection with MSRV-2 of in vitro cultures
and of samples of nucleic acids used for the molecular
biology studies.
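The MSRV-2 primers above contain IUPAC ambiguity codes (Y, R,
M, S, K), so each one denotes a small pool of concrete
sequences. The following is a minimal sketch, not part of the
described method, that counts and enumerates that pool. The
primer strings are taken from SEQ ID NO: 34 and SEQ ID NO: 35
written as contiguous sequences; the function names are
illustrative assumptions.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences a degenerate primer stands for."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence covered by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]

PRIMER_5 = "AGTGYTRCCMCARGGCGCTGAA"  # SEQ ID NO: 34
PRIMER_3 = "GMGGCCAGCAGSAKGTCATCCA"  # SEQ ID NO: 35

print(degeneracy(PRIMER_5), degeneracy(PRIMER_3))
```

Under these assumptions the 5' primer covers 16 sequences and
the 3' primer covers 8.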

All things considered, the first results of PCR detection of
the genome of pathogenic and/or infective agents show that
it is possible that free "virus" may circulate in the blood
stream of patients in an acute, virulent phase, outside the
nervous system. This is compatible with the almost
invariable presence of "gaps" in the blood-brain barrier of
patients in an active phase of MS.

EXAMPLE 7: OBTAINING SEQUENCES OF THE "env" GENE 
OF THE MSRV-1 RETROVIRAL GENOME 

As has already been described in Example 5, a PCR technique
derived from the technique published by Frohman (19) was
used. The technique derived makes it possible, using a
specific primer at the 3' end of the genome to be amplified,
to elongate the sequence towards the 5' region of the genome
to be analysed. This technical variant is described in the
documentation of Clontech Laboratories Inc. (Palo Alto,
California, USA) supplied with its product "5'-AmpliFINDER™
RACE Kit", which was used on a fraction of virion purified
as described above.

In order to carry out an amplification of the 3' region of
the MSRV-1 retroviral genome encompassing the region of the
"env" gene, a study was carried out to determine a consensus
sequence in the LTR regions of the same type as those of the
defective endogenous retrovirus HSERV-9 (18, 24), with which
the MSRV-1 retrovirus displays partial homologies.

The same specific 3' primer was used in the kit protocol for
the synthesis of the cDNA and the PCR amplification; its
sequence is as follows:

GTGCTGATTGGTGTATTTACAATCC (SEQ ID NO: 45)

Synthesis of the complementary DNA (cDNA) and unidirectional
PCR amplification with the above primer were carried out in
one step according to the method described in Patent
EP-A-0,569,272.

The products originating from the PCR were extracted after
purification on agarose gel according to conventional
methods (17), and then resuspended in 10 µl of distilled
water. Since one of the properties of Taq polymerase
consists in adding an adenine at the 3' end of each of the
two DNA strands, the DNA obtained was inserted directly into
a plasmid using the TA Cloning™ kit (British Biotechnology).
The 2 µl of DNA solution were mixed with 5 µl of sterile
distilled water, 1 µl of a 10-fold concentrated ligation
buffer "10X LIGATION BUFFER", 2 µl of "pCR™ VECTOR"
(25 ng/µl) and 1 µl of "TA DNA LIGASE". This mixture was
incubated overnight at 12°C. The following steps were
carried out according to the instructions of the TA Cloning®
kit (British Biotechnology). At the end of the procedure,
the white colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the insert,
after hybridization with a primer complementary to the Sp6
promoter present on the cloning plasmid of the TA Cloning™
Kit. The reaction prior to sequencing was then performed
according to the method recommended for the use of the
sequencing kit "Prism ready reaction kit dye deoxyterminator
cycle sequencing kit" (Applied Biosystems, ref. 401384), and
automatic sequencing was carried out with an Applied
Biosystems "automatic sequencer, model 373 A" apparatus
according to the manufacturer's instructions.

This technical approach was applied to a sample of virion
concentrated as described below from a mixture of culture
supernatants produced by B lymphoblastoid lines such as are
described in Example 2, established from lymphocytes of
patients suffering from MS and possessing reverse
transcriptase activity which is detectable according to the
technique described by Perron et al. (3): the culture
supernatants are collected twice weekly, precentrifuged at
10,000 rpm for 30 minutes to remove cell debris and then
frozen at -80°C or used as they are for the following steps.
The fresh or thawed supernatants are centrifuged on a
cushion of 30% glycerol-PBS at 100,000 g for 2 h at 4°C.
After removal of the supernatant, the sedimented pellet
constitutes the sample of concentrated but unpurified
virions. The pellet thereby obtained is then taken up in a
small volume of an appropriate buffer for the extraction of
RNA. The cDNA synthesis reaction mentioned above is carried
out on this RNA extracted from concentrated extracellular
virion.

RT-PCR amplification according to the technique mentioned
above enabled the clone FBd3 to be obtained, whose sequence,
identified by SEQ ID NO: 46, is presented in Figure 13.

In Figure 14, the sequence homology between the clone FBd3
and the HSERV-9 retrovirus is shown on the matrix chart by a
continuous line for any partial homology greater than or
equal to 65%. It can be seen that there are homologies in
the flanking regions of the clone (with the pol gene at the
5' end and with the env gene and then the LTR at the 3'
end), but that the internal region is totally divergent and
does not display any homology, even weak, with the "env"
gene of HSERV-9. Furthermore, it is apparent that the clone
FBd3 contains a longer "env" region than the one which is
described for the defective endogenous HSERV-9; it may thus
be seen that the internal divergent region constitutes an
"insert" between the regions of partial homology with the
HSERV-9 defective genes.
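A matrix chart of this kind can be approximated by comparing
every window of one sequence with every window of the other
and keeping the pairs whose identity reaches the cut-off. The
sketch below only illustrates that principle: the window
length of 20 is an assumption (the text states only the 65%
cut-off), and the software actually used to draw the figures
is not specified.

```python
def window_identity(a, b):
    """Fraction of identical positions between two equal-length windows."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def dot_matrix(seq1, seq2, window=20, threshold=0.65):
    """Return (i, j) start positions whose windows reach the identity cut-off.

    window is a hypothetical choice; threshold=0.65 mirrors the 65%
    partial-homology cut-off quoted for the matrix charts.
    """
    hits = []
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            if window_identity(seq1[i:i + window], seq2[j:j + window]) >= threshold:
                hits.append((i, j))
    return hits

# Toy sequences, for illustration only (not MSRV-1 or HSERV-9 data).
print(dot_matrix("ACGTACGTAC", "ACGTACGTAC", window=10))
```

Runs of hits along a diagonal correspond to the continuous
lines of the chart; a region without hits corresponds to a
divergent segment such as the "insert" described above.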

EXAMPLE 8: AMPLIFICATION, CLONING AND SEQUENCING
OF THE REGION OF THE MSRV-1 RETROVIRAL GENOME LOCATED
BETWEEN THE CLONES PSJ17 AND FBd3

Four oligonucleotides, F1, B4, F6 and B1, were defined for
amplifying RNA originating from concentrated virions of the
strains POL2 and MS7PG. Control reactions were performed so
as to check for the presence of contaminants (reaction with
water). The amplification consists of a first step of RT-PCR
according to the protocol described in Patent Application
EP-A-0,569,272, followed by a second step of PCR performed
on 10 µl of product of the first step with primers internal
to the amplified first region ("nested" PCR). In the first
RT-PCR cycle, the primers F1 and B4 are used. In the second
PCR cycle, the primers F6 and B1 are used. The primers are
positioned as follows:

[Diagram: the primers F1, F6, B1 and B4, in that order,
positioned along the MSRV-1 RNA between the clone PSJ17
(5' pol MSRV-1) and the clone FBd3 (3' pol MSRV-1 / 5' env).]



Their composition is:

primer F1: TGATGTGAACGGCATACTCACTG (SEQ ID NO: 47)
primer B4: CCCAGAGGTTAGGAACTCCCTTTC (SEQ ID NO: 48)
primer F6: GCTAAAGGAGACTTGTGGTTGTCAG (SEQ ID NO: 49)
primer B1: CAACATGGGCATTTCGGATTAG (SEQ ID NO: 50)

The product of "nested" amplification obtained and
designated "t pol" is presented in Figure 15, and
corresponds to the sequence SEQ ID NO: 51.
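When handling such primer sets it is common to tabulate each
primer's length, GC fraction and reverse complement (the
form in which a reverse primer appears on the sense strand).
The following is a small sketch for the four primers of this
example. The helper names are illustrative assumptions, and
nothing here is taken from the protocol itself.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

PRIMERS = {
    "F1": "TGATGTGAACGGCATACTCACTG",    # SEQ ID NO: 47
    "B4": "CCCAGAGGTTAGGAACTCCCTTTC",   # SEQ ID NO: 48
    "F6": "GCTAAAGGAGACTTGTGGTTGTCAG",  # SEQ ID NO: 49
    "B1": "CAACATGGGCATTTCGGATTAG",     # SEQ ID NO: 50
}

for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```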

EXAMPLE 9: OBTAINING NEW SEQUENCES, EXPRESSED AS
RNA IN CELLS IN CULTURE PRODUCING MSRV-1, AND COMPRISING
AN "env" REGION OF THE MSRV-1 RETROVIRAL GENOME

A library of cDNA was produced according to the procedure
described by the manufacturer of the "cDNA synthesis module,
cDNA rapid adaptor ligation module, cDNA rapid cloning
module and lambda gt10 in vitro packaging module" kits
(Amersham, ref. RPN1256Y/Z, RPN1712, RPN1713, RPN1717,
N334Z), from the messenger RNA extracted from cells of a B
lymphoblastoid line such as is described in Example 2,
established from the lymphocytes of a patient suffering from
MS and possessing reverse transcriptase activity which is
detectable according to the technique described by Perron et
al. (3).

Oligonucleotides were defined for amplifying the cDNA cloned
into the nucleic acid library between the 3' region of the
clone PSJ17 (pol) and the 5' (LTR) region of the clone FBd3.
Control reactions were performed so as to check for the
presence of contaminants (reaction with water). PCR
reactions performed on the nucleic acids cloned into the
library with different pairs of primers enabled a series of
clones linking pol sequences to the MSRV-1 type env or LTR
sequences to be amplified.
Two clones are representative of the sequences obtained in
the cellular cDNA library:

- the clone JLBc1, whose sequence SEQ ID NO: 52 is presented
in Figure 16;

- the clone JLBc2, whose sequence SEQ ID NO: 53 is presented
in Figure 17.

The sequences of the clones JLBc1 and JLBc2 are homologous
to that of the clone FBd3, as is apparent in Figures 18 and
19. The homology between the clone JLBc1 and the clone JLBc2
is shown in Figure 20.

The homologies between the clones JLBc1 and JLBc2 on the one
hand and the HSERV-9 sequence on the other hand are
presented, respectively, in Figures 21 and 22.

It will be noted that the region of homology between JLBc1,
JLBc2 and FBd3 comprises, with a few sequence and size
variations of the "insert", the additional sequence absent
("inserted") in the HSERV-9 env sequence, as described in
Example 8.

It will also be noted that the cloned "pol" region is very
homologous to HSERV-9, does not possess a reading frame
(bearing in mind the sequence errors induced by the
techniques used, including even the automatic sequencer) and
diverges from the MSRV-1 sequences obtained from virions. In
view of the fact that these sequences were cloned from the
RNA of cells expressing MSRV-1 particles, it is probable
that they originate from endogenous retroviral elements
related to the ERV9 family; this is all the more likely
since the pol and env genes are present on the same RNA,
which is clearly not the MSRV-1 genomic RNA. Some of these
ERV9 elements possess functional LTRs which can be activated
by replicative viruses coding for homologous or heterologous
transactivators. Under these conditions, the relationship
between MSRV-1 and HSERV-9 makes probable the
transactivation of the defective (or otherwise) endogenous
ERV9 elements by homologous, or even identical, MSRV-1
transactivating proteins.

Such a phenomenon may induce a viral interference between
the expression of MSRV-1 and the related endogenous
elements. Such an interference generally leads to a
so-called "defective-interfering" expression, some features
of which were to be found in the MSRV-1-infected cultures
studied. Furthermore, such a phenomenon does not fail to
generate the expression of polypeptides, or even of
endogenous retroviral proteins, which are not necessarily
tolerated by the immune system. Such a scheme of aberrant
expression of endogenous elements related to MSRV-1 and
induced by the latter is liable to multiply the aberrant
antigens, and hence to contribute to the induction of
autoimmune processes such as are observed in MS.

It is, however, essential to note that the clones JLBc1 and
JLBc2 differ from the ERV9 or HSERV-9 sequence already
described, in that they possess a longer env region
comprising an additional region totally divergent from ERV9.
Their kinship with the endogenous ERV9 family may hence be
defined, but they clearly constitute novel elements never
hitherto described. In effect, interrogation of the data
banks of nucleic acid sequences available in version No. 15
(1995) of the "Entrez" software (NCBI, NIH, Bethesda, USA)
did not enable a known homologous sequence in the env region
of these clones to be identified.

EXAMPLE 10: OBTAINING SEQUENCES LOCATED IN THE
5' pol AND 3' gag REGION OF THE MSRV-1 RETROVIRAL GENOME

As has already been described in Example 5, a PCR technique
derived from the technique published by Frohman (19) was
used. The technique derived makes it possible, using a
specific primer at the 3' end of the genome to be amplified,
to elongate the sequence towards the 5' region of the genome
to be analysed. This technical variant is described in the
documentation of the firm Clontech Laboratories Inc. (Palo
Alto, California, USA) supplied with its product
"5'-AmpliFINDER™ RACE Kit", which was used on a fraction of
virion purified as described above.

In order to carry out an amplification of the 5' region of
the MSRV-1 retroviral genome starting from the pol sequence
already sequenced (clone F11-1) and extending towards the
gag gene, MSRV-1 specific primers were defined.

The specific 3' primers used in the kit protocol for the
synthesis of the cDNA and the PCR amplification are,
respectively, complementary to the following MSRV-1
sequences:

cDNA: (SEQ ID NO: 54)
CCTGAGTTCTTGCACTAACCC

amplification: (SEQ ID NO: 55)
GTCCGTTGGGTTTCCTTACTCCT

The products originating from the PCR were extracted after
purification on agarose gel according to conventional
methods (17), and then resuspended in 10 µl of distilled
water. Since one of the properties of Taq polymerase
consists in adding an adenine at the 3' end of each of the
two DNA strands, the DNA obtained was inserted directly into
a plasmid using the TA Cloning™ kit (British Biotechnology).
The 2 µl of DNA solution were mixed with 5 µl of sterile
distilled water, 1 µl of a 10-fold concentrated ligation
buffer "10X LIGATION BUFFER", 2 µl of "pCR™ VECTOR"
(25 ng/µl) and 1 µl of "TA DNA LIGASE". This mixture was
incubated overnight at 12°C. The following steps were
carried out according to the instructions of the TA Cloning®
kit (British Biotechnology). At the end of the procedure,
the white colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the insert,
after hybridization with a primer complementary to the Sp6
promoter present on the cloning plasmid of the TA Cloning™
Kit. The reaction prior to sequencing was then performed
according to the method recommended for the use of the
sequencing kit "Prism ready reaction kit dye deoxyterminator
cycle sequencing kit" (Applied Biosystems, ref. 401384), and
automatic sequencing was carried out with an Applied
Biosystems "automatic sequencer, model 373 A" apparatus
according to the manufacturer's instructions.

This technical approach was applied to a sample of virion
concentrated as described below from a mixture of culture
supernatants produced by B lymphoblastoid lines such as are
described in Example 2, established from lymphocytes of
patients suffering from MS and possessing reverse
transcriptase activity which is detectable according to the
technique described by Perron et al. (3): the culture
supernatants are collected twice weekly, precentrifuged at
10,000 rpm for 30 minutes to remove cell debris and then
frozen at -80°C or used as they are for the following steps.
The fresh or thawed supernatants are centrifuged on a
cushion of 30% glycerol-PBS at 100,000 g for 2 h at 4°C.
After removal of the supernatant, the sedimented pellet
constitutes the sample of concentrated but unpurified
virions. The pellet thereby obtained is then taken up in a
small volume of an appropriate buffer for the extraction of
RNA. The cDNA synthesis reaction mentioned above is carried
out on this RNA extracted from concentrated extracellular
virion.

RT-PCR amplification according to the technique mentioned
above enabled the clone GM3 to be obtained, whose sequence,
identified by SEQ ID NO: 56, is presented in Figure 23.

In Figure 24, the sequence homology between the clone GM3
and the HSERV-9 retrovirus is shown on the matrix chart by a
continuous line, for any partial homology greater than or
equal to 65%.

In summary, Figure 25 shows the localization of the
different clones studied above, relative to the known ERV9
genome. In Figure 25, since the MSRV-1 env region is longer
than the reference ERV9 env gene, the additional region is
shown above the point of insertion according to a "V", on
the understanding that the inserted material displays a
sequence and size variability between the clones shown
(JLBc1, JLBc2, FBd3). Figure 26 shows the position of the
different clones studied in the MSRV-1 pol region.

By means of the clone GM3 described above, a possible
reading frame could be defined, covering the whole of the
pol gene, referenced according to SEQ ID NO: 57, shown in
the successive Figures 27a to 27c.

EXAMPLE 11: DETECTION OF ANTI-MSRV-1 SPECIFIC 
ANTIBODIES IN HUMAN SERUM 

Identification of the sequence of the pol gene of the MSRV-1
retrovirus and of an open reading frame of this gene enabled
the amino acid sequence SEQ ID NO: 39 of a region of the
said gene, referenced SEQ ID NO: 40, to be determined (see
Figure 28).

Different synthetic peptides corresponding to fragments of
the protein sequence of MSRV-1 reverse transcriptase encoded
by the pol gene were tested for their antigenic specificity
with respect to sera of patients suffering from MS and of
healthy controls.

The peptides were synthesized chemically by solid-phase
synthesis according to the Merrifield technique (Barany G.
and Merrifield R.B., 1980, in The Peptides, 2, 1-284, Gross
E. and Meienhofer J., Eds., Academic Press, New York). The
practical details are those described below.

a) Peptide synthesis:

The peptides were synthesized on a phenylacetamidomethyl
(PAM)/polystyrene/divinylbenzene resin (Applied Biosystems,
Inc., Foster City, CA), using an "Applied Biosystems 430A"
automatic synthesizer. The amino acids are coupled in the
form of hydroxybenzotriazole (HOBT) esters. The amino acids
used are obtained from Novabiochem (Läufelfingen,
Switzerland) or Bachem (Bubendorf, Switzerland).

The chemical synthesis was performed using a double coupling
protocol with N-methylpyrrolidone (NMP) as solvent. The
peptides were cleaved from the resin, and the side-chain
protective groups removed simultaneously, using hydrofluoric
acid (HF) in a suitable apparatus (type I cleavage
apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of anisole and
1 ml of dimethyl sulphide (DMS) are used. The mixture is
stirred for 45 minutes at -2°C. The HF is then evaporated
off under vacuum. After intensive washes with ether, the
peptide is eluted from the resin with 10% acetic acid and
then lyophilized.

The peptides are purified by preparative high performance
liquid chromatography on a VYDAC C18 type column
(250 x 21 mm) (The Separation Group, Hesperia, CA, USA).
Elution is carried out with an acetonitrile gradient at a
flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC® C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The peptide
which is considered to be of acceptable purity manifests
itself in a single peak representing not less than 95% of
the chromatogram.

The purified peptides are then analysed with the object of
monitoring their amino acid composition, using an Applied
Biosystems 420H automatic amino acid analyser. Measurement
of the (average) chemical molecular mass of the peptides is
obtained using LSIMS mass spectrometry in the positive ion
mode on a VG ZAB.ZSEQ double focusing instrument connected
to a DEC-VAX 2000 acquisition system (VG Analytical Ltd,
Manchester, England).

The reactivity of the different peptides was tested against
sera of patients suffering from MS and against sera of
healthy controls. This enabled a peptide designated POL2B to
be selected, whose sequence is shown in Figure 28 in the
identifier SEQ ID NO: 39, below, encoded by the pol gene of
MSRV-1 (nucleotides 181 to 330).

b) Antigenic properties:

The antigenic properties of the POL2B peptide were
demonstrated according to the ELISA protocol described
below.

The lyophilized POL2B peptide was dissolved in sterile
distilled water at a concentration of 1 mg/ml. This stock
solution was aliquoted and kept at +4°C for use over a
fortnight, or frozen at -20°C for use within 2 months. An
aliquot is diluted in PBS (phosphate buffered saline)
solution so as to obtain a final peptide concentration of
1 microgram/ml. 100 microlitres of this dilution are placed
in each well of microtitration plates ("high-binding"
plastic, COSTAR ref. 3590). The plates are covered with a
"plate-sealer" type adhesive and kept overnight at +4°C for
the phase of adsorption of the peptide to the plastic. The
adhesive is removed and the plates are washed three times
with a volume of 300 microlitres of a solution A (1X PBS,
0.05% Tween 20®), then inverted over an absorbent tissue.
The plates thus drained are filled with 200 microlitres per
well of a solution B (solution A + 10% of goat serum), then
covered with an adhesive and incubated for 45 minutes to
1 hour at 37°C. The plates are then washed three times with
the solution A as described above.

The test serum samples are diluted beforehand to 1/50 in the
solution B, and 100 microlitres of each diluted test serum
are placed in the wells of each microtitration plate. A
negative control is placed in one well of each plate, in the
form of 100 microlitres of buffer B. The plates covered with
an adhesive are then incubated for 1 to 3 hours at 37°C. The
plates are then washed three times with the solution A as
described above. In parallel, a peroxidase-labelled goat
antibody directed against human IgG (Sigma Immunochemicals
ref. A6029) or IgM (Cappel ref. 55228) is diluted in the
solution B (dilution 1/5000 for the anti-IgG and 1/1000 for
the anti-IgM). 100 microlitres of the appropriate dilution
of the labelled antibody are then placed in each well of the
microtitration plates, and the plates covered with an
adhesive are incubated for 1 to 2 hours at 37°C. A further
washing of the plates is then performed as described above.
In parallel, the peroxidase substrate is prepared according
to the directions of the "Sigma fast OPD kit" (Sigma
Immunochemicals, ref. P9187). 100 microlitres of substrate
solution are placed in each well, and the plates are kept
protected from light for 20 to 30 minutes at room
temperature.

When the colour reaction has stabilized, the plates are
placed immediately in an ELISA plate spectrophotometry
reader, and the optical density (OD) of each well is read at
a wavelength of 492 nm. Alternatively, 30 microlitres of 1N
HCl are placed in each well to stop the reaction, and the
plates are read in the spectrophotometer within 24 hours.

The serological samples are introduced in duplicate or in
triplicate, and the optical density (OD) corresponding to
the serum tested is calculated by taking the mean of the OD
values obtained for the same sample at the same dilution.

The net OD of each serum corresponds to the mean OD of the
serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
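The net OD calculation just described reduces to a difference
of two means. A minimal sketch follows; the well readings are
hypothetical values chosen for illustration and are not data
from this study.

```python
from statistics import mean

def net_od(serum_replicates, negative_control_replicates):
    """Mean OD of the serum wells minus mean OD of the negative-control wells."""
    return mean(serum_replicates) - mean(negative_control_replicates)

# Hypothetical duplicate readings at 492 nm (illustration only).
serum_wells = [0.71, 0.69]
control_wells = [0.08, 0.10]

print(round(net_od(serum_wells, control_wells), 2))
```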

c) Detection of anti-MSRV-1 IgG antibodies by ELISA:

The technique described above was used with the POL2B
peptide to test for the presence of anti-MSRV-1 specific IgG
antibodies in the serum of 29 patients for whom a definite
or probable diagnosis of MS was established according to the
criteria of Poser (23), and of 32 healthy controls (blood
donors).

Figure 29 shows the results for each serum tested with an
anti-IgG antibody. Each vertical bar represents the net
optical density (OD at 492 nm) of a serum tested. The
ordinate axis gives the net OD at the top of the vertical
bars. The first 29 vertical bars lying to the left of the
vertical broken line represent the sera of 29 cases of MS
tested, and the 32 vertical bars lying to the right of the
vertical broken line represent the sera of 32 healthy
controls (blood donors).

The mean of the net OD values for the MS sera tested is
0.62. The diagram enables 5 controls to be revealed whose
net OD rises above the grouped values of the control
population. These values may represent the presence of
specific IgGs in symptomless seropositive patients. Two
methods were hence evaluated in order to determine the
statistical threshold of positivity of the test.

WO 98/23755 (PCT/IB97/01482)

The mean of the net OD values for the controls, including
the controls with high net OD values, is 0.36. Without the 5
controls whose net OD values are greater than or equal to
0.5, the mean of the "negative" controls is 0.33. The
standard deviation of the negative controls is 0.10. A
theoretical threshold of positivity may be calculated
according to the formula:

threshold value = (mean of the net OD values of the
seronegative controls) + (2 or 3 x standard deviation of
the net OD values of the seronegative controls).

In the first case, there are considered to be symptomless
seropositives, and the threshold value is equal to
0.33 + (2 x 0.10) = 0.53. The negative results represent a
non-specific "background" of the presence of antibodies
directed specifically against an epitope of the peptide.

In the second case, if the set of controls consisting of
blood donors in apparent good health is taken as a reference
basis, without excluding the sera which are, on the face of
it, seropositive, the standard deviation of the "non-MS
controls" is 0.116. The threshold value then becomes
0.36 + (2 x 0.116) = 0.59.
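Both threshold calculations follow the same rule, mean plus k
standard deviations of the control net OD values. The sketch
below reproduces the two cases from the summary figures
quoted in the text. The function names are illustrative, and
the use of the sample standard deviation in the second helper
is an assumption, since the text does not say which estimator
was used.

```python
from statistics import mean, stdev

def positivity_threshold(control_mean, control_sd, k=2):
    """Threshold = mean of control net ODs + k x their standard deviation."""
    return control_mean + k * control_sd

def threshold_from_values(control_net_ods, k=2):
    """Same rule, computed from individual net OD values.

    Uses the sample standard deviation (an assumption).
    """
    return positivity_threshold(mean(control_net_ods), stdev(control_net_ods), k)

# Summary figures quoted in the text.
first_case = positivity_threshold(0.33, 0.10)    # seropositive controls excluded
second_case = positivity_threshold(0.36, 0.116)  # all controls included

print(round(first_case, 2), round(second_case, 2))
```

With k = 2 this gives 0.53 and 0.59, matching the two values
stated above; passing k=3 gives the stricter variant allowed
by the formula.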

According to this analysis, the test is specific for MS
since, as shown in Table 1, no control has a net OD above
this threshold. In fact, this result reflects the fact that
the antibody titres in patients suffering from MS are, for
the most part, higher than in healthy controls who have been
in contact with MSRV-1.

TABLE NO. 1

                    MS          CONTROLS

                    0.681       0.3515
                    1.0425      0.56
                    0.5675      0.3565
                    0.63        0.449
                    0.588       0.2825
                    0.645       0.55
                    0.6635      0.52
                    0.576       0.2535
                    0.7765      0.55
                    0.5745      0.51
                    0.513       0.426
                    0.4325      0.451
                    0.7255      0.227
                    0.859       0.3905
                    0.6435      0.265
                    0.5795      0.4295
                    0.8655      0.291
                    0.671       0.347
                    0.596       0.4495
                    0.662       0.3725
                    0.602       0.181
                    0.525       0.2725
                    0.53        0.426
                    0.565       0.1915
                    0.517       0.222
                    0.607       0.395
                    0.3705      0.34
                    0.397       0.307
                    0.4395      0.219
                                0.491
                                0.2265
                                0.2605

MEAN                0.62        0.33
STD DEV             0.14        0.10

THRESHOLD VALUE     0.53

In accordance with the first method of calculation, and as
shown in Figure 29 and in the corresponding Table 1, 26 of
the 29 MS sera give a positive result (net OD greater than
or equal to 0.50), indicating the presence of IgGs
specifically directed against the POL2B peptide, hence
against a portion of the reverse transcriptase enzyme of the
MSRV-1 retrovirus encoded by its pol gene, and consequently
against the MSRV-1 retrovirus. Thus, approximately 90% of
the MS patients tested have reacted against an epitope
carried by the POL2B peptide and possess circulating IgGs
directed against the latter.

Five out of 32 blood donors in apparent good
health show a positive result. Thus, it is apparent that
approximately 15% of the symptomless population may have
been in contact with an epitope carried by the POL2B
peptide under conditions which have led to an active
immunization which manifests itself in the persistence of
specific serum IgGs. These conditions are compatible with
an immunization against the MSRV-1 retrovirus reverse
transcriptase during an infection with (and/or reactivation
of) the MSRV-1 retrovirus. The absence of apparent
neurological pathology recalling MS in these seropositive
controls may indicate that they are healthy carriers and
have eliminated an infectious virus after immunizing
themselves, or that they constitute an at-risk population
of chronic carriers. In effect, epidemiological data
showing that a pathogenic agent present in the environment
of regions of high prevalence of MS may be the cause of
this disease imply that a fraction of the population free
from MS has necessarily been in contact with such a
pathogenic agent. It has been shown that the MSRV-1
retrovirus constitutes all or part of this "pathogenic
agent" at the source of MS, and it is hence normal for
controls taken from a healthy population to possess IgG
type antibodies against components of the MSRV-1
retrovirus. Thus, the difference in seroprevalence between
the MS and control populations is extremely significant:
"chi-squared" test, p < 0.001. These results hence point
to an aetiopathogenic role of MSRV-1 in MS.
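
The IgG seroprevalence comparison can be reproduced from the counts quoted above (26 positive out of 29 MS sera, 5 positive out of 32 blood donors). The sketch below is illustrative only: the text does not say which variant of the chi-squared test was used, so an uncorrected Pearson test on a 2 x 2 table is assumed, and the function name is hypothetical.

```python
import math


def chi_squared_2x2(pos_a, neg_a, pos_b, neg_b):
    """Pearson chi-squared statistic (no continuity correction) and
    p-value for a 2 x 2 table with one degree of freedom."""
    n = pos_a + neg_a + pos_b + neg_b
    numerator = n * (pos_a * neg_b - neg_a * pos_b) ** 2
    denominator = (
        (pos_a + neg_a) * (pos_b + neg_b) * (pos_a + pos_b) * (neg_a + neg_b)
    )
    stat = numerator / denominator
    # For one degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts taken from the text: MS sera 26 positive / 3 negative,
# blood donors 5 positive / 27 negative.
stat, p_value = chi_squared_2x2(26, 3, 5, 27)
print(f"chi-squared = {stat:.2f}, p = {p_value:.2e}")
```

Under this assumption the statistic lies well above 10.83, the critical value for p = 0.001 with one degree of freedom, which agrees with the significance level stated in the text.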

d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

The ELISA technique with the POL2B peptide was
used to test for the presence of anti-MSRV-1 IgM specific
antibodies in the serum of 36 patients for whom a definite
or probable diagnosis of MS was established according to
the criteria of Poser (23), and of 42 healthy controls
(blood donors).

Figure 30 shows the results for each serum tested
with an anti-IgM antibody. Each vertical bar represents
the net optical density (OD at 492 nm) of a serum tested.
The ordinate axis gives the net OD at the top of the
vertical bars. The first 36 vertical bars lying to the
left of the vertical line cutting the abscissa axis
represent the sera of 36 cases of MS tested, and the
vertical bars lying to the right of the vertical broken
line represent the sera of 42 healthy controls (blood
donors). The horizontal line drawn in the middle of the
diagram represents a theoretical threshold defining the
boundary of the positive results (in which the top of the
bar lies above) and the negative results (in which the top
of the bar lies below).

The mean of the net OD values for the MS cases
tested is 0.19.

The mean of the net OD values for the controls
is 0.09.

The standard deviation of the negative controls
is 0.05.

In view of the small difference between the mean
and the standard deviation of the controls, the threshold
of theoretical positivity may be calculated according to
the formula:

threshold value = (mean of the net OD values of
the seronegative controls) + (3 x standard deviation of
the net OD values of the seronegative controls).

The threshold value is hence equal to 0.09 +
(3 x 0.05), i.e. 0.26 when calculated from the unrounded
mean and standard deviation of the control values listed
in Table 2; or, in practice, 0.25.
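
The threshold formula can be checked against the net OD values listed in Table 2. The following is a minimal sketch, not the authors' procedure: it assumes the sample standard deviation (`statistics.stdev`), which the text does not specify, and the names `positivity_threshold`, `MS_SERA` and `CONTROLS` are introduced here for illustration.

```python
from statistics import mean, stdev

# Net OD values (IgM ELISA) as listed in Table 2.
MS_SERA = [
    0.064, 0.087, 0.044, 0.115, 0.089, 0.025, 0.097, 0.108, 0.018,
    0.234, 0.274, 0.225, 0.314, 0.522, 0.306, 0.143, 0.375, 0.142,
    0.157, 0.168, 1.051, 0.104, 0.187, 0.044, 0.053, 0.153, 0.07,
    0.033, 0.104, 0.187, 0.044, 0.053, 0.153, 0.07, 0.033, 0.973,
]
CONTROLS = [
    0.243, 0.11, 0.098, 0.028, 0.094, 0.038, 0.176, 0.146, 0.049,
    0.161, 0.113, 0.079, 0.093, 0.127, 0.02, 0.052, 0.062, 0.074,
    0.043, 0.046, 0.041, 0.13, 0.153, 0.107, 0.178, 0.114, 0.078,
    0.118, 0.177, 0.026, 0.024, 0.046, 0.116, 0.04, 0.028, 0.073,
    0.008, 0.074, 0.141, 0.219, 0.047, 0.017,
]


def positivity_threshold(control_ods, k=3.0):
    """Mean of the control ODs plus k standard deviations (sample SD assumed)."""
    return mean(control_ods) + k * stdev(control_ods)


threshold = positivity_threshold(CONTROLS)
ms_positive = [od for od in MS_SERA if od > threshold]
control_positive = [od for od in CONTROLS if od > threshold]

print(f"threshold = {threshold:.2f}")
print(f"positive MS sera: {len(ms_positive)} of {len(MS_SERA)}")
print(f"positive controls: {len(control_positive)} of {len(CONTROLS)}")
```

With the tabulated values the threshold comes out at roughly 0.26, and the classification (seven positive MS sera, no positive control) is the same whether 0.26 or the practical cut-off of 0.25 is applied.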

The negative results represent a non-specific 
"background" of the presence of antibodies directed 
specifically against an epitope of the peptide. 

According to this analysis, and as shown in
Figure 30 and in the corresponding Table 2, the IgM test
is specific for MS, since no control has a net OD above
the threshold. Seven of the 36 MS sera produce a positive
IgM result; a study of the clinical data reveals that
these positive sera were taken during a first attack of MS
or an acute attack in untreated patients. It is known that
IgMs directed against pathogenic agents are produced
during primary infections or during reactivations following
a latency phase of the said pathogenic agent.

The difference in seroprevalence between the MS 
and control populations is extremely significant: 
"chi-squared" test, p < 0.001. 

These results point to an aetiopathogenic role 
of MSRV-1 in MS. 

The detection of IgM and IgG antibodies against
the POL2B peptide enables the course of an MSRV-1 infection
and/or of the viral reactivation of MSRV-1 to be
evaluated.
TABLE No. 2

MS        CONTROLS

0.064     0.243
0.087     0.11
0.044     0.098
0.115     0.028
0.089     0.094
0.025     0.038
0.097     0.176
0.108     0.146
0.018     0.049
0.234     0.161
0.274     0.113
0.225     0.079
0.314     0.093
0.522     0.127
0.306     0.02
0.143     0.052
0.375     0.062
0.142     0.074
0.157     0.043
0.168     0.046
1.051     0.041
0.104     0.13
0.187     0.153
0.044     0.107
0.053     0.178
0.153     0.114
0.07      0.078
0.033     0.118
0.104     0.177
0.187     0.026
0.044     0.024
0.053     0.046
0.153     0.116
0.07      0.04
0.033     0.028
0.973     0.073
          0.008
          0.074
          0.141
          0.219
          0.047
          0.017

MEAN         0.19      0.09
STD. DEV.    0.23      0.05
THRESHOLD VALUE 0.26
e) Search for immunodominant epitopes in the
POL2B peptide:

In order to reduce the non-specific background
and to optimize the detection of the responses of the
anti-MSRV-1 antibodies, the synthesis of octapeptides,
advancing in successive one amino acid steps, covering the
whole of the sequence determined by POL2B, was carried out
according to the protocol described below.

The chemical synthesis of overlapping octapeptides
covering the amino acid sequence 61-110 shown in the
identifier SEQ ID NO: 39 was carried out on an activated
cellulose membrane according to the technique of BERG et
al. (1989. J. Am. Chem. Soc., 111, 8024-8026) marketed by
Cambridge Research Biochemicals under the trade name
Spotscan. This technique permits the simultaneous
synthesis of a large number of peptides and their
analysis.
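
The layout of overlapping octapeptides advancing one residue at a time can be sketched as a sliding window. This is an illustration only: the full 61-110 sequence of SEQ ID NO: 39 is not reproduced here, so the example uses the 13-residue stretch obtained by joining two octapeptides quoted in this example (FCIPVRPD and RPDSQFLF, which overlap on Arg-Pro-Asp). The function name is hypothetical.

```python
def overlapping_peptides(sequence, length=8, step=1):
    """Return every window of `length` residues, advancing `step` residues."""
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Stretch reconstructed from two octapeptides named in the text.
fragment = "FCIPVRPDSQFLF"
peptides = overlapping_peptides(fragment)
for index, peptide in enumerate(peptides, start=1):
    print(index, peptide)

# Number of spots needed to cover a 50-residue region (positions 61 to 110);
# a placeholder sequence of the right length is enough for counting.
n_spots = len(overlapping_peptides("X" * (110 - 61 + 1)))
print("octapeptides for residues 61-110:", n_spots)
```

Each window shares seven residues with its neighbour, which is what allows an immunoreactive region to be delimited to within one amino acid.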

The synthesis is carried out with esterified
amino acids in which the α-amino group is protected with
an FMOC group (Nova Biochem) and the side-chain groups
with protective groups such as trityl, t-butyl ester or
t-butyl ether. The esterified amino acids are solubilized in
N-methylpyrrolidone (NMP) at a concentration of 300 nM,
and 0.9 ml are applied to spots of deposit of bromophenol
blue. After incubation for 15 minutes, a further
application of amino acids is carried out, followed by
another 15-minute incubation. If the coupling between two
amino acids has taken place correctly, a coloration
modification (change from blue to yellow-green) is
observed. After three washes in DMF, an acetylation step
is performed with acetic anhydride. Next, the terminal
amino groups of the peptides in the process of synthesis
are deprotected with 20% pyridine in DMF. The spots of
deposit are restained with a 1% solution of bromophenol
blue in DMF, washed three times with methanol and dried.
This set of operations constitutes one cycle of addition
of an amino acid, and this cycle is repeated until the
synthesis is complete. When all the amino acids have been
added, the NH2-terminal group of the last amino acid is
deprotected with 20% piperidine in DMF and acetylated with
acetic anhydride. The groups protecting the side chain are
removed with a dichloromethane/trifluoroacetic
acid/triisobutylsilane (5 ml/5 ml/250 ml) mixture. The
immunoreactivity of the peptides is then tested by ELISA.

After synthesis of the different octapeptides in
duplicate on two different membranes, the latter are
rinsed with methanol and washed in TBS (0.1M Tris pH 7.2),
then incubated overnight at room temperature in a
saturation buffer. After several washes in TBS-T (0.1M
Tris pH 7.2 - 0.05% Tween 20), one membrane is incubated
with a 1/50 dilution of a reference serum originating from
a patient suffering from MS, and the other membrane with a
1/50 dilution of a pool of sera of healthy controls. The
membranes are incubated for 4 hours at room temperature.
After washes with TBS-T, a β-galactosidase-labelled
anti-human immunoglobulin conjugate (marketed by Cambridge
Research Biochemicals) is added at a dilution of 1/200,
and the mixture is incubated for two hours at room
temperature. After washes of the membranes with 0.05% TBS-T
and PBS, the immunoreactivity in the different spots is
visualized by adding 5-bromo-4-chloro-3-indolyl
β-D-galactopyranoside in potassium. The intensity of
coloration of the spots is estimated qualitatively with a
relative value from 0 to 5 as shown in the attached
Figures 31 to 33.

In this way, it is possible to determine two
immunodominant regions at each end of the POL2B peptide,
corresponding, respectively, to the amino acid sequences
65-75 (SEQ ID NO: 41) and 92-109 (SEQ ID NO: 42), according
to Figure 34, and lying, respectively, between the
octapeptides Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp (FCIPVRPD)
and Arg-Pro-Asp-Ser-Gln-Phe-Leu-Phe (RPDSQFLF), and
Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg (TVLPQGFR) and
Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln (LFGQALAQ), and a region which is
less reactive but apparently more specific, since it does
not produce any background with the control serum,
represented by the octapeptides Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu
(LFAFEDPL) (SEQ ID NO: 43) and Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn
(FAFEDPLN) (SEQ ID NO: 44).

These regions make it possible to define new
peptides which are more specific and more immunoreactive
according to the usual techniques.

It is thus possible, as a result of the
discoveries made and the methods developed by the inventors,
to carry out a diagnosis of MSRV-1 infection and/or
reactivation and to evaluate a therapy in MS on the basis
of its efficacy in "negativing" the detection of these
agents in the patients' biological fluids. Furthermore,
early detection in individuals not yet displaying
neurological signs of MS could make it possible to institute a
treatment which would be all the more effective with
respect to the subsequent clinical course for the fact
that it would precede the lesion stage which corresponds
to the onset of neurological disorders. Now, at the
present time, a diagnosis of MS cannot be established
before a symptomatology of neurological lesions has set
in, and hence no treatment is instituted before the
emergence of a clinical picture suggestive of lesions of
the central nervous system which are already significant.
The diagnosis of an MSRV-1 and/or MSRV-2 infection and/or
reactivation in man is hence of decisive importance, and
the present invention provides the means of doing this.

It is thus possible, apart from carrying out a
diagnosis of MSRV-1 infection and/or reactivation, to
evaluate a therapy in MS on the basis of its efficacy in
"negativing" the detection of these agents in the
patients' biological fluids.
EXAMPLE 12: OBTAINING A CLONE LB19 CONTAINING A
PORTION OF THE gag GENE OF THE MSRV-1 RETROVIRUS

A PCR technique derived from the technique
published by Gonzalez-Quintial R et al. (19) and PLAZA et
al. (25) was used. From the total RNAs extracted from a
fraction of virion purified as described above, the cDNA
was synthesized using a specific primer (SEQ ID No. 64) at
the 3' end of the genome to be amplified, using EXPAND™
REVERSE TRANSCRIPTASE (BOEHRINGER MANNHEIM).

cDNA:

AAGGGGCATG GACGAGGTGG TGGCTTATTT (SEQ ID NO: 65)
(antisense)

After purification, a poly(G) tail was added at
the 5' end of the cDNA using the "Terminal transferases
kit" marketed by the company Boehringer Mannheim,
according to the manufacturer's protocol.

An anchoring PCR was carried out using the
following 5' and 3' primers:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC (SEQ ID No. 91)
(sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)

Next, a semi-nested anchoring PCR was carried
out with the following 5' and 3' primers:

AGATCTGCAG AATTCGATAT CA (SEQ ID No. 92) (sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)
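
The relationship between the two sense primers can be checked directly from the sequences given above. The sketch below is illustrative: the variable and function names are hypothetical, and the sequences are copied from the text with the spacing removed.

```python
def clean(seq):
    """Strip the spacing used in the printed sequence listing."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


ANCHOR_PRIMER = clean("AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC")  # SEQ ID No. 91
ADAPTER_PRIMER = clean("AGATCTGCAG AATTCGATAT CA")                # SEQ ID No. 92
ANTISENSE_PRIMER = clean("AAATGTCTGC GGCACCAATC TCCATGTT")        # SEQ ID No. 64

# The anchor primer is the adapter sequence followed by a poly(C) stretch,
# which can pair with the poly(G) tail added to the cDNA.
is_prefix = ANCHOR_PRIMER.startswith(ADAPTER_PRIMER)
tail = ANCHOR_PRIMER[len(ADAPTER_PRIMER):]

print("adapter is a prefix of the anchor primer:", is_prefix)
print("anchor tail:", tail, "pairs with", reverse_complement(tail))
print("antisense primer length:", len(ANTISENSE_PRIMER))
```

The semi-nested round thus reuses the same antisense primer and replaces the anchor primer by its adapter portion alone.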

The products originating from the PCR were
purified on agarose gel according to
conventional methods (17), and then resuspended in
10 microlitres of distilled water. Since one of the
properties of Taq polymerase consists in adding an adenine
at the 3' end of each of the two DNA strands, the DNA
obtained was inserted directly into a plasmid using the TA
Cloning™ kit (British Biotechnology). The 2 µl of DNA
solution were mixed with 5 µl of sterile distilled water,
1 µl of 10-fold concentrated ligation buffer "10x LIGATION
BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml) and 1 µl of "T4
DNA LIGASE". This mixture was incubated overnight at 12°C.
The following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analysed on agarose gel. Plasmids
possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning Kit™. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to
the manufacturer's instructions.

PCR amplification according to the technique
mentioned above was used on a cDNA synthesized from the
nucleic acids of fractions of infective particles purified
on a sucrose gradient, according to the technique
described by H. Perron (13), from culture supernatants of
B lymphocytes of a patient suffering from MS, immortalized
with Epstein-Barr virus (EBV) strain B95 and expressing
retroviral particles associated with reverse transcriptase
activity as described by Perron et al. (3) and in French
Patent Applications MS 10, 11 and 12. This amplification
yielded the clone LB19, whose sequence, identified by
SEQ ID NO: 59, is presented in Figure 35.
The clone makes it possible to define, with the
clone GM3 previously sequenced and the clone G+E+A (see 
Example 15) , a region of 690 base pairs representative of 
a significant portion of the gag gene of the MSRV-1 
retrovirus, as presented in Figure 36. This sequence 
designated SEQ ID NO: 88 is reconstituted from different 
clones overlapping at their ends. This sequence is 
identified under the name MSRV-1 "gag*" region. In Figure 
36, a potential reading frame with the translation into 
amino acids is presented below the nucleic acid sequence. 

EXAMPLE 13: OBTAINING A CLONE FBd13 CONTAINING A
pol GENE REGION RELATED TO THE MSRV-1 RETROVIRUS AND AN
APPARENTLY INCOMPLETE ENV REGION CONTAINING A POTENTIAL
READING FRAME (ORF) FOR A GLYCOPROTEIN

Extraction of viral RNAs: The RNAs were 
extracted according to the method briefly described below. 

A pool of culture supernatant of B lymphocytes
of patients suffering from MS (650 ml) is centrifuged for
30 minutes at 10,000 g. The viral pellet obtained is
resuspended in 300 microlitres of PBS/10 mM MgCl2. The
material is treated with a DNAse (100 mg/ml)/RNAse
(50 mg/ml) mixture for 30 minutes at 37°C and then with
proteinase K (50 mg/ml) for 30 minutes at 46°C.

The nucleic acids are extracted with one volume
of a phenol/0.1% SDS (V/V) mixture heated to 60°C, and
then re-extracted with one volume of phenol/chloroform
(1:1; V/V).

Precipitation of the material is performed with
2.5 V of ethanol in the presence of 0.1 V of sodium
acetate pH 5.2. The pellet obtained after centrifugation is
resuspended in 50 microlitres of sterile DEPC water.
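
As a worked illustration of the precipitation ratios, the volumes to add can be computed from the volume of the aqueous phase. This sketch assumes that both ratios (2.5 V ethanol, 0.1 V sodium acetate) refer to the volume of the sample before any addition, which the text does not state explicitly; the 300 microlitre figure is only an example input, and the function name is hypothetical.

```python
def precipitation_volumes(sample_ul, ethanol_ratio=2.5, acetate_ratio=0.1):
    """Volumes (in microlitres) of sodium acetate and ethanol to add,
    both expressed relative to the starting sample volume (assumption)."""
    return {
        "sodium_acetate_ul": round(sample_ul * acetate_ratio, 1),
        "ethanol_ul": round(sample_ul * ethanol_ratio, 1),
    }


volumes = precipitation_volumes(300.0)
print(volumes)
```

If the ethanol ratio is instead taken relative to the volume after addition of the acetate, the ethanol volume is about 10% larger.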

The sample is treated again with 50 mg/ml of
"RNAse free" DNAse for 30 minutes at room temperature,
extracted with one volume of phenol/chloroform and
precipitated in the presence of sodium acetate and
ethanol.

The RNA obtained is quantified by an OD reading
at 260 nm. The presence of MSRV-1 and the absence of DNA
contaminant is monitored by a PCR and an MSRV-1-specific
RT-PCR associated with a specific ELOSA for the MSRV-1
genome.

Synthesis of cDNA: 

5 mg of RNA are used to synthesize a cDNA primed
with a poly(dT) oligonucleotide according to the
instructions of the "cDNA Synthesis Module" kit (ref
RPN 1256, Amersham) with a few modifications: the reverse
transcription is performed at 45°C instead of the
recommended 42°C.

The synthesis product is purified by a double
extraction and a double purification according to the
manufacturer's instructions.

The presence of MSRV-1 is verified by an MSRV-1
PCR associated with a specific ELOSA for the MSRV-1
genome.

"Long Distance PCR": (LD-PCR) 

500 ng of cDNA are used for the LD-PCR step 
(Expand Long Template System; Boehringer (ref. 1681 842)). 

Several pairs of oligonucleotides were used.
Among these, the pair defined by the following primers:
5' primer: GGAGAAGAGC AGCATAAGTG G (SEQ ID NO: 66)
3' primer: GTGCTGATTG GTGTATTTAC AATCC (SEQ ID NO: 67).

The amplification conditions are as follows:
94°C 10 seconds
56°C 30 seconds
68°C 5 minutes;
10 cycles, then 20 cycles with an increment of
20 seconds in each cycle on the elongation time. At the
end of this first amplification, 2 microlitres of the
amplification product are subjected to a second
amplification under the same conditions as before.
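
The cycling programme can be written out cycle by cycle to see how the elongation time grows. This is a sketch under a stated assumption: the 20-second increment is taken to apply from the eleventh cycle onwards (so that cycle 11 lasts 5 min 20 s), which is one reading of the text; ramp times between temperatures are ignored, and the function name is hypothetical.

```python
DENATURATION_S = 10   # 94°C
ANNEALING_S = 30      # 56°C
ELONGATION_S = 300    # 68°C, 5 minutes


def elongation_times(base_s=ELONGATION_S, fixed_cycles=10,
                     extended_cycles=20, increment_s=20):
    """Elongation time of each cycle: constant at first, then increasing
    by `increment_s` per cycle (assumed to start at the first extended cycle)."""
    times = [base_s] * fixed_cycles
    times += [base_s + increment_s * k for k in range(1, extended_cycles + 1)]
    return times


times = elongation_times()
total_s = sum(DENATURATION_S + ANNEALING_S + t for t in times)

print("cycles:", len(times))
print("elongation in last cycle (s):", times[-1])
print("total hold time (h):", total_s / 3600)
```

Under this reading the programme holds for about four hours per round of amplification, not counting temperature ramps.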
The LD-PCR reactions are conducted in a Perkin
model 9600 PCR apparatus in thin-walled microtubes
(Boehringer).

The amplification products are monitored by
electrophoresis of 1/5th of the amplification volume
(10 microlitres) in 1% agarose gel. For the pair of
primers described above, a band of approximately 1.7 Kb is
obtained.

Cloning of the amplified fragment:

The PCR product was purified by passage through
a preparative agarose gel and then through a Costar column
(Spin; D. Dutcher) according to the supplier's
instructions.

2 microlitres of the purified solution are
ligated with 50 ng of vector PCRII according to the
supplier's instructions (TA Cloning Kit; British
Biotechnology).

The recombinant vector obtained is isolated by
transformation of competent DH5αF' bacteria. The bacteria
are selected using their resistance to ampicillin and the
loss of metabolism for Xgal (= white colonies). The
molecular structure of the recombinant vector is confirmed
by plasmid minipreparation and hydrolysis with the enzyme
EcoRI.

FBd13, a positive clone for all these criteria,
was selected. A large-scale preparation of the recombinant
plasmid was performed using the Midiprep Qiagen kit (ref
12243) according to the supplier's instructions.

Sequencing of the clone FBd13 is performed by
means of the Perkin Prism Ready Amplitaq FS dye terminator
kit (ref. 402119) according to the manufacturer's
instructions. The sequence reactions are introduced into
a Perkin type 377 or 373A automatic sequencer. The
sequencing strategy consists in gene walking carried out
on both strands of the clone FBd13.

The sequence of the clone FBd13 is identified by
SEQ ID NO: 58.

In Figure 37, the sequence homology between the
clone FBd13 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line for any partial homology
greater than or equal to 70%. It can be seen that there
are homologies in the flanking regions of the clone (with
the pol gene at the 5' end and with the env gene and then
the LTR at the 3' end), but that the internal region is
totally divergent and does not display any homology, even
weak, with the env gene of HSERV-9. Furthermore, it is
apparent that the clone FBd13 contains a longer "env"
region than the one which is described for the defective
endogenous HSERV-9; it may thus be seen that the internal
divergent region constitutes an "insert" between the
regions of partial homology with the HSERV-9 defective
genes.

This additional sequence determines a potential
orf, designated ORF B13, which is represented by its amino
acid sequence SEQ ID NO: 87.

The molecular structure of the clone FBd13 was
analyzed using the GeneWork software and GenBank and
SwissProt data banks.

5 glycosylation sites were found.

The protein does not have significant homology
with already known sequences.

It is probable that this clone originates from a
recombination of an endogenous retroviral element (ERV),
linked to the replication of MSRV-1.

Such a phenomenon is bound to generate
the expression of polypeptides, or even of endogenous
retroviral proteins which are not necessarily tolerated by
the immune system. Such a scheme of aberrant expression of
endogenous elements related to MSRV-1 and/or induced by
the latter is liable to multiply the aberrant antigens,
and hence tends to contribute to the induction of
autoimmune processes such as are observed in MS. It
clearly constitutes a novel element never hitherto
described. In effect, interrogation of the data banks of
nucleic acid sequences available in version No. 19 (1996)
of the "Entrez" software (NCBI, NIH, Bethesda, USA) did
not enable a known homologous sequence comprising the
whole of the env region of this clone to be identified.

EXAMPLE 14: OBTAINING A CLONE FP6 CONTAINING A
PORTION OF THE pol GENE, WITH A REGION CODING FOR THE
REVERSE TRANSCRIPTASE ENZYME HOMOLOGOUS TO THE CLONE POL*
MSRV-1, AND A 3' pol REGION DIVERGENT FROM THE EQUIVALENT
SEQUENCES DESCRIBED IN THE CLONES POL*, tpol, FBd3, JLBc1
and JLBc2

A 3' RACE was performed on total RNA extracted
from plasma of a patient suffering from MS. A healthy
control plasma treated under the same conditions was used
as negative control. The synthesis of cDNA was carried out
with the following modified oligo(dT) primer:
5' GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 3' (SEQ ID NO: 68)
and Boehringer "Expand RT" reverse transcriptase
according to the conditions recommended by the company. A
PCR was performed with the enzyme Klentaq (Clontech) under
the following conditions: 94°C 5 min then 93°C 1 min, 58°C
1 min, 68°C 3 min for 40 cycles and 68°C for 8 min, and
with a final reaction volume of 50 µl.

Primers used for the PCR:

- 5' primer, identified by SEQ ID NO: 69
5' GCCATCAAGC CACCCAAGAA CTCTTAACTT 3';

- 3' primer, identified by SEQ ID NO: 68 (= the
same as for the cDNA)

A second, so-called "semi-nested" PCR was
carried out with a 5' primer located within the region
already amplified. This second PCR was performed under the
same experimental conditions as those used in the first
PCR, using 10 µl of the amplification product originating
from the first PCR.

Primers used for the semi-nested PCR:

- 5' primer, identified by SEQ ID NO: 70
5' CCAATAGCCA GACCATTATA TACACTAATT 3';

- 3' primer, identified by SEQ ID NO: 68 (= the
same as for the cDNA)

Primers SEQ ID NO: 69 and SEQ ID NO: 70 are
specific for the pol* region: position No. 403 to No. 422
and No. 641 to No. 670, respectively.

An amplification product was thus obtained from
the extracellular RNA extracted from the plasma of a
patient suffering from MS. The corresponding fragment was
not observed for the plasma of the healthy control. This
amplification product was cloned in the following manner.

The amplified DNA was inserted into a plasmid
using the TA Cloning™ kit. The 2 µl of DNA solution were
mixed with 5 µl of sterile distilled water, 1 µl of a
10-fold concentrated ligation buffer "10x LIGATION
BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml) and 1 µl of
"T4 DNA LIGASE". This mixture was incubated overnight at
12°C. The following steps were carried out according to
the instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out in
order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analyzed on agarose gel. Plasmids
possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning Kit™. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to
the manufacturer's instructions.

The clone obtained, designated FP6, enables a
region of 467 bp which is 89% homologous to the pol*
region of the MSRV-1 retrovirus and a region of 1167 bp
which is 64% homologous to the pol region of ERV-9
(No. 1634 to 2856) to be defined.

The clone FP6 is represented in Figure 38 by its
nucleotide sequence identified by SEQ ID NO: 61. The three
potential reading frames of this clone are indicated by
their amino acid sequence under the nucleotide sequence.

EXAMPLE 15: OBTAINING A REGION DESIGNATED G+E+A
CONTAINING AN ORF FOR A RETROVIRAL PROTEASE, BY PCR
AMPLIFICATION OF THE NUCLEIC ACID SEQUENCE CONTAINED
BETWEEN THE 5' REGION DEFINED BY THE CLONE "GM3" AND THE
3' REGION DEFINED BY THE CLONE POL*, FROM THE RNA
EXTRACTED FROM A POOL OF PLASMAS OF PATIENTS SUFFERING
FROM MS

Oligonucleotides specific for the MSRV-1
sequences already identified by the Applicant were defined
in order to amplify the retroviral RNA originating from
virions present in the plasma of patients suffering from
MS. Control reactions were performed so as to monitor the
presence of contaminants (reaction with water). The
amplification consists of a step of RT-PCR followed by a
"nested" PCR. Pairs of primers were defined for amplifying
three overlapping regions (designated G, E and A) on the
regions defined by the sequences of the clones GM3 and
pol* described above.

Semi-nested RT-PCR for amplification of the region G:

- in the first RT-PCR cycle, the following
primers are used:
primer 1: SEQ ID NO: 71 (sense)
primer 2: SEQ ID NO: 72 (antisense)

- in the second PCR cycle, the following primers
are used:
primer 1: SEQ ID NO: 73 (sense)
primer 4: SEQ ID NO: 74 (antisense)

Nested RT-PCR for amplification of the region E:

- in the first RT-PCR cycle, the following
primers are used:
primer 5: SEQ ID NO: 75 (sense)
primer 6: SEQ ID NO: 76 (antisense)

- in the second PCR cycle, the following primers
are used:
primer 7: SEQ ID NO: 77 (sense)
primer 8: SEQ ID NO: 78 (antisense)

Semi-nested RT-PCR for amplification of the region A:

- in the first RT-PCR cycle, the following
primers are used:
primer 9: SEQ ID NO: 79 (sense)
primer 10: SEQ ID NO: 80 (antisense)

- in the second PCR cycle, the following primers
are used:
primer 9: SEQ ID NO: 81 (sense)
primer 11: SEQ ID NO: 82 (antisense)

The primers and the regions G, E and A which
they define are positioned as follows:

cDNA

1 G 4 2

5 7 E 8 6

3 A 11 10

< >< >

GM3 POL*
The sequence of the region defined by the
different clones G, E and A was determined after cloning
and sequencing of the "nested" amplification products.

The clones G, E and A were assembled together by
PCR with the primers 1 at the 5' end of the fragment G and
11 at the 3' end of the fragment A, the primers being
described above. An approximately 1580-bp fragment G+E+A
was amplified and inserted into a plasmid using the TA
Cloning (trademark) kit. The sequence of the amplification
product corresponding to G+E+A was determined and analysis
of the G+E and E+A overlaps was carried out. The sequence
is shown in Figure 39, and corresponds to the sequence SEQ
ID NO: 89.
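
The analysis of the G+E and E+A overlaps amounts to joining fragments whose ends share a common stretch. The sketch below illustrates that idea on short made-up sequences; the three toy fragments are hypothetical and are not taken from SEQ ID NO: 89, exact matching is assumed (no sequencing errors), and the function names are introduced here for illustration.

```python
def longest_overlap(left, right, min_len=4):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge(left, right, min_len=4):
    """Join two fragments on their exact end overlap."""
    size = longest_overlap(left, right, min_len)
    if size == 0:
        raise ValueError("fragments do not overlap")
    return left + right[size:]


# Hypothetical toy fragments standing in for the regions G, E and A.
G = "ATGGCCTTAGGC"
E = "TTAGGCAACGTTCA"
A = "CGTTCAGGATTA"

assembled = merge(merge(G, E), A)
print(assembled, len(assembled))
```

Real fragments of several hundred base pairs would normally be aligned with a tolerance for mismatches; exact suffix matching is used here only to keep the example short.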

A reading frame coding for an MSRV-1 retroviral
protease was found in the region E. The amino acid
sequence of the protease, identified by SEQ ID NO: 90, is
presented in Figure 40.

EXAMPLE 16: OBTAINING A CLONE LTRGAG12, RELATED
TO AN ENDOGENOUS RETROVIRAL ELEMENT (ERV) CLOSE TO MSRV-1,
IN THE DNA OF AN MS LYMPHOBLASTOID LINE PRODUCING VIRIONS
AND EXPRESSING THE MSRV-1 RETROVIRUS

A nested PCR was performed on the DNA extracted
from a lymphoblastoid line (B lymphocytes immortalized
with the EBV virus strain B95, as described above and as
is well known to a person skilled in the art) expressing
the MSRV-1 retrovirus and originating from peripheral
blood lymphocytes of a patient suffering from MS.

In the first PCR step, the following primers are
used:

primer 4327: CTCGATTTCT TGCTGGGCCT TA (SEQ ID NO: 83)
primer 3512: GTTGATTCCC TCCTCAAGCA (SEQ ID NO: 84)

This step comprises 35 amplification cycles with
the following conditions: 1 min at 94°C, 1 min at 54°C and
4 min at 72°C.
In the second PCR step, the following primers
are used:

primer 4294: CTCTACCAAT CAGCATGTGG (SEQ ID NO: 85)
primer 3591: TGTTCCTCTT GGTCCCTAT (SEQ ID NO: 86)

This step comprises 35 amplification cycles with
the following conditions: 1 min at 94°C, 1 min at 54°C and
4 min at 72°C.

The products originating from the PCR were
purified on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with
5 µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10x LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/ml) and 1 µl of "T4 DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analyzed on agarose gel. The
plasmids possessing an insert detected under UV light
after staining the gel with ethidium bromide were selected
for sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning Kit™. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to
the manufacturer's instructions.

Thus, a clone designated LTRGAG12 could be
obtained, and is represented by its internal sequence
identified by SEQ ID NO: 60.

This clone is probably representative of 
endogenous elements close to ERV-9, present in human DNA, 
in particular in the DNA of patients suffering from MS, 
and capable of interfering with the expression of the 
MSRV-1 retrovirus, hence capable of having a role in the 
pathogenesis associated with the MSRV-1 retrovirus and
capable of serving as marker for a specific expression in 
the pathology in question. 

EXAMPLE 17: DETECTION OF ANTI-MSRV-1 SPECIFIC 
ANTIBODIES IN HUMAN SERUM 

Identification of the sequence of the pol gene 
of the MSRV-1 retrovirus and of an open reading frame of 
this gene enabled the amino acid sequence SEQ ID NO: 63 of 
a region of the said gene, referenced SEQ ID NO: 62, to be 
determined. 

Different synthetic peptides corresponding to 
fragments of the protein sequence of MSRV-1 reverse 
transcriptase encoded by the pol gene were tested for 
their antigenic specificity with respect to sera of 
patients suffering from MS and of healthy controls. 

The peptides were synthesized chemically by 
solid-phase synthesis according to the Merrifield tech- 
nique (22). The practical details are those described 
below. 

a) Peptide synthesis: 

The peptides were synthesized on a phenylacetamidomethyl
(PAM)/polystyrene/divinylbenzene resin (Applied Biosystems,
Inc., Foster City, CA), using an "Applied Biosystems 430A"
automatic synthesizer. The amino acids are coupled in the
form of hydroxybenzotriazole (HOBT) esters. The amino acids
used are obtained from Novabiochem (Läufelfingen,
Switzerland) or Bachem (Bubendorf, Switzerland).

The chemical synthesis was performed using a
double coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cleaved from the resin, and the
side-chain protective groups removed simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of
anisole and 1 ml of dimethyl sulphide (DMS) are used. The
mixture is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.

The peptides are purified by preparative high
performance liquid chromatography on a VYDAC C18 type
column (250 x 21 mm) (The Separation Group, Hesperia, CA,
USA). Elution is carried out with an acetonitrile gradient
at a flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC™ C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the
object of monitoring their amino acid composition, using
an Applied Biosystems 420H automatic amino acid analyser.
Measurement of the (average) chemical molecular mass of
the peptides is obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB.ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was
tested against sera of patients suffering from MS and
against sera of healthy controls. This enabled a peptide
designated S24Q to be selected, whose sequence is
identified by SEQ ID NO: 63, encoded by a nucleotide
sequence of the pol gene of MSRV-1 (SEQ ID NO: 62).

b) Antigenic properties: 

The antigenic properties of the S24Q peptide 
were demonstrated according to the ELISA protocol 
described below. 

The lyophilized S24Q peptide was dissolved in
10% acetic acid at a concentration of 1 mg/ml. This stock
solution was aliquoted and kept at +4°C for use over a
fortnight, or frozen at -20°C for use within 2 months. An
aliquot is diluted in PBS (phosphate buffered saline)
solution so as to obtain a final peptide concentration of
5 micrograms/ml. 100 microlitres of this dilution are
placed in each well of Nunc Maxisorb (trade name)
microtitration plates. The plates are covered with a
"plate-sealer" type adhesive and kept for 2 hours at +37°C
for the phase of adsorption of the peptide to the plastic.
The adhesive is removed and the plates are washed three
times with a volume of 300 microlitres of a solution A
(1X PBS, 0.05% Tween 20®), then inverted over an
absorbent tissue. The plates thus drained are filled with
250 microlitres per well of a solution B (solution A + 10%
of goat serum), then covered with an adhesive and
incubated for 1 hour at 37°C. The plates are then washed
three times with the solution A as described above.
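
The dilution arithmetic of the coating step can be sketched as follows. This is only an illustration of the C1 x V1 = C2 x V2 relation applied to the figures given above (1 mg/ml stock, 5 micrograms/ml coating solution, 100 microlitres per well); the function name and the assumption of a full 96-well plate are illustrative and do not come from the protocol.

```python
def stock_volume_ul(stock_ug_per_ml, final_ug_per_ml, final_volume_ul):
    """Volume of stock (in microlitres) such that C1 * V1 = C2 * V2."""
    return final_ug_per_ml * final_volume_ul / stock_ug_per_ml


# Figures from the protocol: stock at 1 mg/ml (1000 ug/ml),
# coating solution at 5 ug/ml, 100 ul per well.
# Assumption for illustration only: one full plate of 96 wells.
wells = 96
coating_volume_ul = wells * 100

peptide_stock_ul = stock_volume_ul(1000, 5, coating_volume_ul)
pbs_ul = coating_volume_ul - peptide_stock_ul
dilution_factor = 1000 / 5

print(peptide_stock_ul, pbs_ul, dilution_factor)
```

Under these assumptions the stock is diluted 200-fold in PBS; in practice a small excess volume would normally be prepared to allow for pipetting losses.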

The test serum samples are diluted beforehand to
1/100 in the solution B, and 100 microlitres of each
dilute test serum are placed in the wells of each
microtitration plate. A negative control is placed in one well
of each plate, in the form of 100 microlitres of buffer B.
The plates covered with an adhesive are then incubated for
1 hour 30 min at 37°C. The plates are then washed three
times with the solution A as described above. For the IgG
response, a peroxidase-labelled goat antibody directed
against human IgG (marketed by Jackson Immuno Research
Inc.) is diluted in the solution B (dilution: 1/10,000).
100 microlitres of the appropriate dilution of the
labelled antibody are then placed in each well of the
microtitration plates, and the plates covered with an
adhesive are incubated for 1 hour at 37°C. A further
washing of the plates is then performed as described
above. In parallel, the peroxidase substrate is prepared
according to the directions of the bioMerieux kits. 100
microlitres of substrate solution are placed in each well,
and the plates are placed protected from light for 20 to
30 minutes at room temperature.

When the colour reaction has stabilized, 
50 microlitres of Color 2 (bioMerieux trade name) are 
placed in each well in order to stop the reaction. The 
plates are placed immediately in an ELISA plate 
spectrophotometric reader, and the optical density (OD) of 
each well is read at a wavelength of 492 nm. 

The serological samples are introduced in dupli- 
cate or in triplicate, and the optical density (OD) 
corresponding to the serum tested is calculated by taking 
the mean of the OD values obtained for the same sample at 
the same dilution. 

The net OD of each serum corresponds to the mean
OD of the serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20, 10% goat serum).
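
The net OD calculation described above amounts to a difference of two means. The sketch below illustrates it; the function name and the readings are hypothetical values chosen for the example, not measurements from this study.

```python
from statistics import mean


def net_od(serum_replicates, negative_control_replicates):
    """Mean OD of the serum wells minus mean OD of the negative control wells."""
    return mean(serum_replicates) - mean(negative_control_replicates)


# Hypothetical OD readings at 492 nm (duplicate wells), for illustration only.
serum_wells = [0.412, 0.398]
negative_control_wells = [0.051, 0.049]

result = net_od(serum_wells, negative_control_wells)
print(round(result, 3))
```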

c) Detection of anti-MSRV-1 IgG antibodies 
(S24Q) by ELISA: 

The technique described above was used with the
S24Q peptide to test for the presence of anti-MSRV-1
specific IgG antibodies in the serum of 15 patients for
whom a definite diagnosis of MS was established according
to the criteria of Poser (23), and of 15 healthy controls
(blood donors).

Figure 41 shows the results for each serum
tested with an anti-IgG antibody. Each vertical bar
represents the net optical density (OD at 492 nm) of a
serum tested. The ordinate axis gives the net OD at the
top of the vertical bars. The first 15 vertical bars lying
to the left of the vertical broken line represent the sera
of 15 healthy controls (blood donors), and the 15 vertical
bars lying to the right of the vertical broken line
represent the sera of 15 cases of MS tested. The diagram
enables 2 controls to be revealed whose OD rises above the
grouped values of the control population. These values may
represent the presence of specific IgGs in symptomless
seropositive patients. Two methods were hence evaluated in
order to determine the statistical threshold of positivity
of the test.

The mean of the net OD values for the controls,
including the controls with high net OD values, is 0.129
and the standard deviation is 0.06. Without the 2 controls
whose OD values are greater than 0.2, the mean of the
"negative" controls is 0.107 and the standard deviation is
0.03. A theoretical threshold of positivity may be
calculated according to the formula:

threshold value = (mean of the net OD values of the
negative controls) + (2 or 3 x standard deviation
of the net OD values of the negative controls).

In the first case, there are considered to be
symptomless seropositives, and the threshold value is
equal to 0.11 + (3 x 0.03) = 0.20. The negative results
represent a non-specific "background" of the presence of
antibodies directed specifically against an epitope of the
peptide.

In the second case, if the set of controls
consisting of blood donors in apparent good health is
taken as a reference basis, without excluding the sera
which are, on the face of it, seropositive, the standard
deviation of the "non-MS controls" is 0.06. The threshold
value then becomes 0.13 + (3 x 0.06) = 0.31.
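
The two threshold calculations can be reproduced from the control column of Table 3. The sketch below is illustrative: the values are transcribed from that table, and the use of the sample standard deviation (statistics.stdev) is an assumption, since the text does not say which estimator was used. Recomputed in this way, the thresholds come out close to the reported 0.20 and 0.31.

```python
from statistics import mean, stdev

# Net OD values of the 15 controls, transcribed from Table 3.
controls = [0.101, 0.058, 0.126, 0.131, 0.105, 0.294, 0.116, 0.088,
            0.105, 0.172, 0.137, 0.223, 0.080, 0.073, 0.132]


def positivity_threshold(values, k=3):
    """Mean plus k standard deviations (sample estimator assumed)."""
    return mean(values) + k * stdev(values)


# First case: the two controls whose net OD exceeds 0.2 are set aside.
presumed_negative = [v for v in controls if v <= 0.2]
threshold_first = positivity_threshold(presumed_negative)

# Second case: all 15 blood donors are kept as the reference population.
threshold_second = positivity_threshold(controls)

print(round(threshold_first, 2), round(threshold_second, 2))
```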

According to this latter analysis, the test is
specific for MS since, as shown in Table 3, no control
has a net OD above this threshold. In fact, this result
reflects the fact that the antibody titres in patients
suffering from MS are, for the most part, higher than in
healthy controls who have been in contact with MSRV-1.

In accordance with the first method of calculation,
and as shown in Figure 41 and in Table 3, 6 of the
15 MS sera give a positive result (OD greater than or
equal to 0.2), indicating the presence of IgGs
specifically directed against the S24Q peptide, hence
against a portion of the reverse transcriptase enzyme of
the MSRV-1 retrovirus encoded by its pol gene, and
consequently against the MSRV-1 retrovirus.

Thus, approximately 40% of the MS patients
tested have reacted against an epitope carried by the S24Q
peptide and possess circulating IgGs directed against the
latter.
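
The count of positive sera under the first method can be checked against Table 3 with a few lines of code. This is a sketch only: the values are transcribed from the table and the function name is illustrative. With the 0.2 cut-off it gives 6 of 15 MS sera and 2 of 15 control sera.

```python
# Net OD values transcribed from Table 3.
ms = [0.136, 0.391, 0.370, 0.119, 0.267, 0.141, 0.102, 0.180,
      0.411, 0.164, 0.049, 0.644, 0.268, 0.065, 0.074]
controls = [0.101, 0.058, 0.126, 0.131, 0.105, 0.294, 0.116, 0.088,
            0.105, 0.172, 0.137, 0.223, 0.080, 0.073, 0.132]


def seropositive(values, threshold):
    """Return (number of sera at or above the threshold, fraction of the group)."""
    positives = sum(1 for v in values if v >= threshold)
    return positives, positives / len(values)


ms_positive, ms_fraction = seropositive(ms, 0.20)
control_positive, control_fraction = seropositive(controls, 0.20)

print(ms_positive, round(ms_fraction, 2))
print(control_positive, round(control_fraction, 2))
```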

Two out of 15 blood donors in apparent good
health show a positive result. Thus, it is apparent that
approximately 13% of the symptomless population may have
been in contact with an epitope carried by the S24Q
peptide under conditions which have led to an active
immunization which manifests itself in the persistence of
specific serum IgGs. These conditions are compatible with
an immunization against the MSRV-1 retrovirus reverse
transcriptase during an infection with (and/or reactivation
of) the MSRV-1 retrovirus. The absence of apparent
neurological pathology recalling MS in these seropositive
controls may indicate that they are healthy carriers and
have eliminated an infectious virus after immunizing
themselves, or that they constitute an at-risk population
of chronic carriers. In effect, epidemiological data
showing that a pathogenic agent present in the environment
of regions of high prevalence of MS may be the cause of
this disease imply that a fraction of the population free
from MS has necessarily been in contact with such a
pathogenic agent. It has been shown that the MSRV-1
retrovirus constitutes all or part of this "pathogenic
agent" at the source of MS, and it is hence normal for
controls taken from a healthy population to possess IgG
type antibodies against components of the MSRV-1
retrovirus.

Lastly, the detection of anti-S24Q antibodies in
only one out of two MS cases tested here may reflect the
fact that this peptide does not represent an
immunodominant MSRV-1 epitope, that inter-individual
strain variations may induce an immunization against a
divergent peptide motif in the same region, or that the
course of the disease and the treatments followed may
modulate over time the antibody response against the S24Q
peptide.



TABLE No. 3

             CONTROLS    MS

             0.101       0.136
             0.058       0.391
             0.126       0.37
             0.131       0.119
             0.105       0.267
             0.294       0.141
             0.116       0.102
             0.088       0.18
             0.105       0.411
             0.172       0.164
             0.137       0.049
             0.223       0.644
             0.08        0.268
             0.073       0.065
             0.132       0.074

Mean         0.129
Std. Dev.    0.06
Threshold    0.31





d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

The ELISA technique with the S24Q peptide was
used to test for the presence of anti-MSRV-1 IgM specific
antibodies in the same sera as above.

Figure 42 shows the results for each serum tested
with an anti-IgM antibody. Each vertical bar represents
the net optical density (OD at 492 nm) of a serum tested.
The ordinate axis gives the net OD at the top of the
vertical bars. The first 15 vertical bars lying to the
left of the vertical line cutting the abscissa axis
represent the sera of 15 healthy controls (blood donors),
and the vertical bars lying to the right of the vertical
broken line represent the sera of 15 cases of MS tested.

The mean of the OD values for the MS cases
tested is 1.6.

The mean of the net OD values for the controls
is 0.7.

The standard deviation of the negative controls
is 0.6.

The threshold of theoretical positivity may be
calculated according to the formula:

threshold value = (mean of the OD values of the negative
controls) + (3 x standard deviation of
the OD values of the negative controls)

The threshold value is hence equal to 0.7 + (3 x 0.6) =
2.5.

The negative results represent a non-specific
"background" of the presence of antibodies directed
specifically against an epitope of the peptide.

According to this analysis, and as shown in
Figure 42 and in the corresponding Table 4, the IgM test
is specific for MS, since no control has a net OD above
the threshold. 6 of the 15 MS sera produce a positive IgM
result.

The difference in seroprevalence between the MS
and control populations is extremely significant:
"chi-squared" test, p < 0.002.

These results point to an aetiopathogenic role 
of MSRV-1 in MS. 

Thus, the detection of IgM and IgG antibodies
against the S24Q peptide makes it possible to evaluate,
alone or in combination with other MSRV-1 peptides, the
course of an MSRV-1 infection and/or of the viral
reactivation of MSRV-1.

TABLE No. 4

             CONTROLS    MS

             1.449       0.974
             0.371       6.117
             0.448       2.883
             0.456       1.945
             0.885       1.787
             2.235       0.273
             0.301       1.766
             0.138       0.668
             0.16        2.603
             1.073       0.802
             1.366       0.245
             0.283       0.147
             0.262       2.441
             0.585       0.287
             0.356       0.589

Mean         0.7
Std. Dev.    0.6
Threshold
Value        2.5



It is possible, as a result of the new
discoveries made and the new methods developed by the
inventors, to permit the improved implementation of
diagnostic tests for MSRV-1 infection and/or reactivation
and to evaluate a therapy in MS and/or RA on the basis of
its efficacy in "negativing" the detection of these agents
in the patients' biological fluids. Furthermore, early
detection in individuals not yet displaying neurological
signs of MS or rheumatological signs of RA could make it
possible to institute a treatment which would be all the
more effective with respect to the subsequent clinical
course for the fact that it would precede the lesion stage
which corresponds to the onset of the clinical disorders.
Now, at the present time, a diagnosis of MS or RA cannot
be established before a symptomatology of lesions has set
in, and hence no treatment is instituted before the
emergence of a clinical picture suggestive of lesions
which are already significant. The diagnosis of an MSRV-1
and/or MSRV-2 infection and/or reactivation in man is
hence of decisive importance, and the present invention
provides the means of doing this.

It is thus possible, apart from carrying out a
diagnosis of MSRV-1 infection and/or reactivation, to
evaluate a therapy in MS on the basis of its efficacy in
"negativing" the detection of these agents in the
patients' biological fluids.

EXAMPLE 18:

1) MATERIALS AND METHODS

- Patients and clinical samples

Choroid plexus cells from MS patients and
controls were obtained from the brain-cell library,
Laboratoire R. Escourolles, Hôpital de la Salpêtrière,
Paris, France. Non-tumoral leptomeningeal cells from
controls were obtained as previously described (26).
Peripheral blood samples from MS and control patients,
used for obtaining B-cell lines and plasma, were obtained
from the Neurological Departments, CHU de Grenoble, and from
INSERM U 134, Hôpital de la Salpêtrière, France. Clinical
details and origin of the 10 MS patients and of the 10
patients with other neurological diseases who provided CSF
samples are given in Table 6.

- Cell cultures, virus isolation and purification

All cell-types were cultured as previously
described (3, 5, 26).

All cultures were regularly screened for mycoplasma
contamination with an ELISA mycoplasma-detection kit
(Boehringer). Neither the cell extracts nor the
supernatants used contained detectable mycoplasma.

Extracellular virion purification and sucrose density
gradients were performed as previously described (3, 5,
26). From each sucrose gradient 0.5-1 ml fractions were
collected from the top of the tubes, with a 1000 µl
Pipetman and a different sterile tip for each fraction.
60 µl were used for RT activity assay and the rest was
mixed with 1 volume of buffer containing 4 M guanidinium
thiocyanate, 0.5% N-lauroyl sarcosine, 25 mM EDTA, 0.2%
β-mercaptoethanol adjusted to pH 5.5 with acetic acid. These
mixtures were frozen at -80°C for further RNA extraction
or directly processed according to Chomczynski (20), with
an overnight precipitation step at -20°C, in presence of
RNase-free glycogen (Boehringer). RNA was dissolved in 20 to
50 µl of DEPC-treated water in the presence of 1-2 µl of
recombinant RNase-inhibitor (PROMEGA) and 0.1 mM DTT. 10 µl
aliquots were used for each RT-PCR.

- Reverse transcriptase activity

RT-activity was tested with 20 mM Mg++ and poly-Cm
or polyC templates, in virion pellets or fractions from
sucrose gradients as previously described (3, 5, 26).

- cDNA synthesis and 'Pan-retro' RT-PCR with degenerate
primers

A total RT-activity between 10^6 and 10^7 dpm was
required in the fraction containing the peak of purified
virions. The "Pan-retro" RT-PCR technique (27) was
performed on virion RNA extracted by the method of
Chomczynski (20) and dissolved in 20 µl RNase-free water.
5 µl RNA solution was incubated for 30 min at 37°C with
0.3 units (3 units for CSF series) of RNase-free DNase-1
(Boehringer) in a 20 µl reaction containing 7.5 mM random
hexamers, 5 mM Hepes-HCl pH 6.9, 75 mM KCl, 3 mM MgCl2, 10
mM DTT, 50 mM Tris-HCl pH 7.5, 0.5 mM each dNTP, and 20
units recombinant RNase inhibitor (Promega). The DNase was
then heat inactivated at 80°C for 10 min. 20 units MoMLV
RT (Pharmacia) and a further 20 units of RNase inhibitor
were added to each tube in a Genesphere™ enclosure
(Safetech, Ireland) and cDNA was synthesised for 90 min at
37°C. Following reverse transcription, the cDNA was boiled
for 5 min then cooled rapidly on ice. The Round 1 PCR mix
(final volume 25 µl per reaction; 20 mM Tris-HCl pH 8.4,
60 mM KCl, 2.5 mM MgCl2, 200 ng each of primers PAN-UO and
PAN-DI [see Figure 44], 0.2 mM each dNTP) was treated with
0.3 units DNase-1 and then heat inactivated as above.
2.5 µl cDNA was added in the Genesphere™ enclosure and the
tubes heated to 80°C before adding 0.5 units Taq
polymerase (Perkin Elmer) individually to each tube ("hot
start"). Round 1 PCR parameters were 35 cycles of 95°C for
1 min, 34°C for 30 sec, 72°C for 1 min, with a final 7 min
extension at 72°C. 0.5 µl of Round 1 PCR product was
transferred to the Round 2 DNase-treated PCR mix
(composition as for Round 1 but containing primers PAN-UI
and PAN-DI) using the "hot start" procedure. Round 2 PCR
parameters were as for Round 1 but using 30 cycles only
and annealing at 45°C for 1 min.

- Cloning of PCR products

PCR products were cloned using the TA-cloning®
kit (British Biotechnology) according to the
manufacturer's recommendations.

- Sequencing

Sequencing reactions were performed using the
"Prism ready reaction kit dye deoxyterminator cycle
sequencing kit" (Applied Biosystems). Automatic sequence
analysis was performed on an automatic sequencer (Applied
Biosystems, 373 A).

- RT-PCR with ST1 primer sets

The first PCR round was performed directly from the
cDNA reaction mixture according to the one-step RT-PCR
technique described by Mallet et al. (28). This one-step
RT-PCR procedure reduced the probability of airborne
contamination when opening the tubes and transferring PCR
reagents after an independent cDNA synthesis. RNA was
extracted as previously from 2 ml of plasma (snap-frozen in
liquid nitrogen and stored at -80°C) or from a 500 µl
sucrose fraction with a total RT-activity above 10^6 dpm,
and resuspended in 50 µl of RNase-free water. For each
RT-PCR reaction 10 µl of RNA solution was incubated in a
Perkin-Elmer 480 thermocycler, 15 min at 20°C with 1 U of
RNase-free DNase 1 and 1.2 µl of 10X DNase buffer (50 mM
Tris, 10 mM MgCl2 and 0.1 mM DTT) containing 1 U/µl of
RNase-inhibitor (PROMEGA), and heated at 70°C for 10 min for
DNase inactivation. The solution was placed on ice and
mixed (in conditions preventing airborne dust/DNA
contamination) with 88 µl of PCR mix containing: 1X taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each first round
primer (ST1.1 upstream primer:
5' AGGAGTAAGGAAACCCAACGGAC 3' (SEQ ID NO: 99); ST1.1
downstream primer: 5' TAAGAGTTGCACAAGTGCG 3' (SEQ ID
NO: 100)), 2.5 U/tube of taq (Appligene) and 10 U/tube of
AMV-RT (Boehringer). Each tube was further incubated in a
Perkin-Elmer 480 thermocycler for 10 min at 65°C, followed
by 2 h at 42°C for cDNA synthesis and 5 min at 95°C for
inactivation of AMV-RT and DNA denaturation. First round
parameters were 40 cycles of 95°C for 1 min, 53°C for 2.5
min, 72°C for 1 min, with a final extension of 10 min at
72°C. 10 µl of the first round were transferred to the
second round PCR mix previously treated at 20°C for 15 min
with RNase-free DNase 1 (0.02 U/µl) followed by DNase
inactivation at 70°C for 10 min. This mix contained 1X taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each second round
primer [ST1.2 upstream primer: 5' TCAGGGATAGCCCCCATCTAT 3'
(SEQ ID NO: 101); ST1.2 downstream primer:
5' AACCCTTTGCCACTACATCAATTT 3' (SEQ ID NO: 102)] and
2.5 U/tube of taq (Appligene). Second round parameters
were 30 cycles of 95°C for 1 min, 53°C for 1.5 min, 72°C
for 1 min, with a final extension of 8 min at 72°C. 20 µl
of this nested RT-PCR product were deposited on a 0.7%
agarose gel containing ethidium bromide and exposed to UV
light for the visualization of amplified products.

- Hybridisation analysis of PCR products: MSRV-pol
detection by ELOSA

The protocol was essentially as previously
described (21) but with the following modifications: Nunc
Maxisorb microtitre plates were coated with 100 ng per
well capture probe CpV1b (see Figure 44) either by passive
adsorption (21) or alternatively by using streptavidin
coated plates and biotinylated CpV1b. Peroxidase-labelled
detector probe DpV1 (see Figure 44) was used and the assay
cut-off was defined as the mean of 4 negative controls
plus 0.2 OD492 units.
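
The ELOSA cut-off rule stated above can be written out as a short calculation. This is a sketch under stated assumptions: the readings are hypothetical, the function names are illustrative, and treating a sample as positive only when its OD is strictly above the cut-off is an assumption, since the text does not specify how a value exactly at the cut-off is scored.

```python
from statistics import mean


def elosa_cutoff(negative_control_ods, margin=0.2):
    """Cut-off = mean OD492 of the negative controls + 0.2 OD units."""
    return mean(negative_control_ods) + margin


def is_positive(sample_od, cutoff):
    """Assumed convention: positive only if strictly above the cut-off."""
    return sample_od > cutoff


# Hypothetical OD492 readings of the 4 negative controls of one plate.
negative_controls = [0.05, 0.06, 0.04, 0.05]
cutoff = elosa_cutoff(negative_controls)

print(round(cutoff, 2), is_positive(0.80, cutoff), is_positive(0.10, cutoff))
```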

- RNA extraction, cDNA synthesis and PCR amplification
from MS plasma samples:

Total RNA was extracted from human MS plasma by
a guanidinium method as described elsewhere (29). Total RNA
extracted from 100 µl of plasma was treated with RNase-free
DNase I (0.1 U/µl; Boehringer Mannheim, France) and
reverse transcribed under the conditions recommended by
the manufacturer, using Superscript reverse transcriptase
(Gibco-BRL, France). The resulting cDNAs were amplified by
semi-nested PCR through 35 cycles (94°C 1 min, 55°C 1 min,
72°C 1 min 30 sec) and 72°C 8 min for a final extension.
Three different fragments in the RT region were amplified
by the following specific primers:

- in the protease (PRT) region, for the 1st and
2nd round of PCR, respectively, sense primer
[5' TCC AGC AGC AGG ACT GAG GGT 3' (SEQ ID NO: 103)] and
antisense primers [5' CTG TCC GTT GGG TTT CCT TAC TCC T 3'
(SEQ ID NO: 104) / 5' GAC AGC AAA TGG GTA TTC CTT TCC 3'
(SEQ ID NO: 105)]

- in the fragment A of the RT region (cf. Fig.
46), for the 1st and 2nd round of PCR, respectively, sense
primer [5' AGG AGT AAG GAA ACC CAA CGG ACA G 3' (SEQ ID
NO: 106)] and antisense primers [5' TGT ATA TAA TGG TCT GGC
TAT TGG G 3' (SEQ ID NO: 107) / 5' TTC GGC AGA AAC CTG TTA
TGC CAA GG 3' (SEQ ID NO: 108)]

- in the fragment B of the RT region (cf. Fig.
46), for the 1st and 2nd round of PCR, respectively, sense
primers [5' GGC TCT GCT CAC AGG AGA TTA GAT AC 3' (SEQ ID
NO: 109) / 5' AAA GGC ACC AGG GCC CTC AGT GAG GA 3' (SEQ ID
NO: 110)] and antisense primer [5' GGT TTA AGA GTT GCA
CAA GTG CGC AGT C 3' (SEQ ID NO: 101)].
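
The primer sequences listed above can be cross-checked with a short script. The sketch below is illustrative and its helper names are not part of the method; it shows that the first-round antisense primer of the protease region (SEQ ID NO: 104) is the exact reverse complement of the sense primer of fragment A (SEQ ID NO: 106), so the two amplicons share the same 25 nucleotides at their junction.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer):
    """Remove the triplet spacing used in the text and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the primer list above.
fragment_a_sense = clean("AGG AGT AAG GAA ACC CAA CGG ACA G")   # SEQ ID NO: 106
prt_antisense = clean("CTG TCC GTT GGG TTT CCT TAC TCC T")      # SEQ ID NO: 104

shared_junction = reverse_complement(fragment_a_sense) == prt_antisense
print(len(fragment_a_sense), shared_junction)
```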
The amplified fragments were analysed on
ethidium bromide-stained agarose gels, cloned in TA
cloning vector (Invitrogen) and sequenced.

2) RESULTS

- Specific retroviral RNA is found in extracellular
virions from MS patient-derived cell cultures and in MS
patients' CSF.

Choroid plexus cells (4) (obtained post-mortem)
and EBV-immortalized peripheral blood B-lymphocytes (30,
31) from MS patients gave rise to cultures expressing 100-
120 nm viral particles associated with RT-activity similar
to that of the original LM7 isolate (3). Similar cell-
types from non-MS donors produced neither this RT-activity
nor virions. All the 'infected' cultures were poorly
and/or transiently productive and/or had a limited
lifespan. Therefore, in order to analyse the genomic RNA
present in the very limited quantity of extracellular
virions, we used an RT-PCR approach to amplify, with
degenerate primers, a conserved region of the pol gene
present in all known retroviruses (12); the techniques
based on this approach will be called "Pan-retro" RT-PCR.
Extensive DNase treatment of samples and reagents was
essential, because human DNA contains many endogenous
retroviral elements amplifiable by this technique.

"Pan-retro" RT-PCR experiments were performed on sucrose-
density gradient purified virions from supernatants of
different types of cell cultures and their non-infected
controls: (i) choroid plexus cells sampled post-mortem
from MS brain (PLI-1), (ii) choroid plexus cells from non-
MS brain autopsy, infected by co-culture with irradiated
LM7 cells (LM7P), and (iii) identical non-infected
choroid-plexus cells. "Early" B-cell lines obtained by
spontaneous in vitro transformation of two EBV-
seropositive individuals, (iv) one MS patient and (v) one
non-MS control, were also analysed. Figure 43 illustrates
the RT-activity in sucrose-gradient fractions obtained



5/9/2006, EAST Version: 2.0.3.0 



WO 98/23755 



PCT/IB97/01482 



from the B-cell cultures. The technique described by Shih
et al. (12) was modified in a semi-nested RT-PCR protocol
(27) using degenerate primers (Fig. 2) and extensive DNase
treatment. PCR amplifications were performed in London
(Dept. of Virology, U.C.L.M.S.) on coded aliquots of the
density gradient fractions. Blind and systematic cloning
and sequencing of the PCR products were undertaken in an
independent laboratory (bioMerieux, Lyon). After complete
sequencing of 20 to 30 clones per sucrose gradient
fraction, the codes were broken and results analysed in
parallel with the RT-activity data.

Table 5 presents the distribution of sequences obtained
from sucrose gradient fractions containing the peak of
viral RT-activity in MS-derived cultures and also the
sequences amplified from the corresponding RT-activity
negative fractions of uninfected cultures. The predominant
sequence detected in bands of the expected size (≈140 bp)
amplified in all the RT-activity positive fractions (but
not in the RT-activity negative fractions) was different
from known retroviruses and was designated MSRV-cpol.
MSRV-cpol sequences exhibited partial homology (70-75%)
with ERV9, a previously described endogenous retroviral
sequence (18). A few ERV9 sequences (>90% homology with
ERV9) were also present but clearly represented a minority
of clones. In addition to typical pol sequences, numerous
PCR artefacts (primer multimers, concatemers or single-
primer amplifications) related to the use of degenerate
primers and low-temperature annealing, were found in all
samples (Table 5).

Figure 44 shows an alignment of a consensus sequence of
MSRV-cpol with the corresponding VLPQG / YMDD region of
diverse retroviruses. Figure 45 displays a phylogenic tree
based on the evolutionarily conserved amino acid sequences
of both exogenous and endogenous retroviruses in this
region. From this tree it can be seen that the pol gene of




MSRV is phylogenically related to the C-type group of
oncovirinae.

A small scale study was performed to determine the
prevalence of MSRV-cpol sequences in the CSF of patients
with MS. Identification of MSRV-cpol in PCR products by
cloning and sequencing is both laborious and time
consuming. We therefore devised an enzyme-linked
oligosorbent assay (ELOSA), using a capture probe (CpV1b)
and a peroxidase-labelled detector probe (DpV1), for the
rapid identification of MSRV-cpol sequences in
"Pan-retrovirus" PCR products (Figure 44). The specificity of
this sandwich hybridisation-based assay for MSRV-cpol was
tested with both distantly related (HIV and MoMLV) and
closely related (ERV9) pol sequences. No significant cross
reactivity with such targets was observed despite the
ability of the ELOSA to detect as little as 0.01 ng of
MSRV-cpol DNA.

Cerebrospinal fluid (CSF) samples were available from 10
patients with MS and from 10 patients with other
neurological disorders. Total RNA was extracted from CSF
pellets, reverse transcribed and amplified as above. ELOSA
analysis (Table 6) of the PCR products revealed MSRV-cpol
sequences in 5 of the 10 MS patient samples but in none of
the 10 samples from patients with other neurological
diseases (P<0.05). The presence of MSRV-cpol did not
appear to be correlated with age, sex or type of MS, but
was seen in untreated patients only (5/6). No patient with
immunosuppressive therapy was found positive (0/4). No
correlation between MSRV-cpol detection and CSF cell count
was observed.
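
The significance level quoted for the CSF series (5 of 10 MS samples positive against 0 of 10 samples from other neurological diseases) can be illustrated with an exact test on the 2 x 2 table. The text does not name the test that was used; the two-sided Fisher exact test below is therefore an assumption, and the function name is illustrative. Under that assumption the p-value is about 0.03, consistent with the reported P<0.05.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of x positives in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    return sum(probability(x) for x in range(lowest, highest + 1)
               if probability(x) <= observed * (1 + 1e-9))


# Rows: MS patients, other neurological diseases.
# Columns: MSRV-cpol positive, MSRV-cpol negative.
p_value = fisher_exact_two_sided(5, 5, 0, 10)
print(round(p_value, 4))
```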

- Cloning and sequencing a larger region of the pol gene

An independent identification of the MSRV
genomic sequence was obtained by a non-PCR approach using
RNA extracted from concentrated virions derived from 2.5
liters of LM7-infected sub-cultures of choroid plexus
cells. A limited number of clones was obtained by direct
cloning of the cDNA, one of which (PSJ17) showed partial
homology with ERV9 pol. Specific primers based on the
MSRV-cpol region and on the PSJ17 clone, amplified a 740
bp fragment linking the two independent sequences in RNA
extracted from purified virions. PSJ17 was localised on
the 3' side of MSRV-cpol. Further sequence extension on
the 5' side of MSRV-cpol and on the 3' side of PSJ17, was
obtained using RT-PCR approaches on RNA from purified LM7-
like virions produced in MS choroid plexus cultures (4).

In Figure 46, the nucleotide sequence
corresponding to overlapping clones obtained by sequence
extension in the pol gene is represented with the
amino acid translation corresponding to the putative open
reading frames (ORFs) of the protease and of the
reverse-transcriptase. The active site motifs of the protease
(PRT) and of the reverse-transcriptase (RT) are
underlined. In the C-terminal region of the RT sequence,
the dispersed amino acid residues regularly present in
retroviral RNase H domains are also underlined.

- Non-degenerate primers detect MSRV-specific RNA in
virions associated with the peak of RT activity and in
MS patients' plasma

PCR primers (ST1.1 primer set; positions 603-625/1732-1714,
on Fig. 46) based on overlapping clones in the pol
gene amplified a 1.15 kb segment of the RT region from
several different isolates obtained from different MS
patients. Nested primers (ST1.2; positions 869-889/1513-1490,
on Fig. 46) generated a 700 bp fragment (Figure 47)
which was more easily visualised by ethidium bromide
staining than the first-round product generated by ST1.1.
The specificity of PCR products was confirmed by stringent
hybridisation with a peroxidase-labelled MSRV-cpol probe
(Fig. 44), using the ELOSA technique (21).

The ST1.1 and ST1.2 primer sets were used to detect
extracellular MSRV RNA in human plasma, although they are
not optimal for this application. Figure 47 illustrates the
results of PCR amplification of cDNA derived from 2 MS
patient and 2 control plasma samples tested in parallel
with cDNA from the sucrose density gradient fractions of
an MS choroid plexus isolate. Taq-sequencing of the 700 bp
bands confirmed the presence of MSRV sequence. A very
faint 700 bp band is also visible in fraction 10, which
corresponds to the bottom of the tube where aggregated
particles usually sediment. Control RT-PCR for cellular
aldolase transcripts on plasma-derived RNA was negative,
indicating that the results were not due to cellular RNA
released by cell lysis during plasma separation. It should
be noted that this PCR technique was not designed for
epidemiological studies, since its sensitivity is impaired
by the length of the cDNA required (1.15 kb).

Non-degenerate primers amplifying three
fragments of the pol gene (the whole protease region, and
regions A and B of the reverse transcriptase; cf. Fig. 46)
were also used to confirm the presence of MSRV sequences
in DNase-treated RNA from MS plasma. These fragments were
amplified from the plasma of a further 4 MS patients with
active disease. Sequence analysis confirmed that the PRT
and RT regions were homologous (>95% and >90%
respectively) to MSRV sequences previously obtained from
culture virions. No such sequences were detected in plasma
from healthy controls (n=4), tested in parallel with MS
plasma.
3) DISCUSSION 
- Phylogeny of MSRV 

From the results of this study, it can be
concluded that the virus previously referred to as "LM7"
(3, 5, 26) possesses an RNA genome containing the MSRV pol
sequences described here.

The conserved RT motif of both MSRV and ERV9 is two amino
acids shorter than that of other retroviruses, apart from
human foamy viruses, which nonetheless have a functional
RT. The potential ORF encompassing the entire PRT-RT
region is consistent with the virion-associated RT
activity detected in sucrose density gradients with
infected culture supernatants. Moreover, since we have
recently succeeded in expressing a recombinant protein
from the sequence of MSRV protease cloned from MS plasma,
we can confirm that the potential PRT ORF is genuine.
Similar cloning and expression of other sequences
containing potential ORFs for MSRV proteins is being
undertaken to confirm their ability to encode enzymes and
structural proteins of MSRV virions.

The phylogenetic tree in Figure 45, based on the most
conserved amino acid sequence in retroviruses
(VLPQG...YXDD), shows that the MSRV pol gene is related to
the C-type oncoviruses. Apart from ERV9, the closest known
retroviral element is RTLV-H, a human endogenous sequence
known to have a subtype with a functional pol gene (32).
In the pol region, this phylogenetic affiliation to C-type
oncoviruses apparently contradicts our previous
assumptions based on the general morphology of the
particles observed by electron microscopy (EM), which were
compatible with a B or D-type oncovirus (3, 5, 26).
However, preliminary data on env sequences detected in
MSRV virions would suggest a greater phylogenetic proximity
to D-type. Such differences between the phylogenies of the
pol and env genes have been described in MPMV and suggest a
recombinatorial origin in D-type retroviruses (33). D to C
type morphological conversion is also possible, since it
has been reported that a single amino acid substitution in
the gag protein can convert retrovirus morphology to that
of a different type (34).

- Is MSRV an exogenous retrovirus sharing extensive 
homology with a related endogenous retrovirus family or an 
endogenous retrovirus producing extracellular virions? 

Southern blot analysis with an MSRV pol probe
under stringent conditions showed hybridisation with a
multicopy endogenous family (data not presented),
indicating the existence of endogenous elements more
closely related to MSRV than ERV9 itself. Consequently, we
were unable to look for a virion-specific provirus in
MSRV-producing cells. In agreement with the Southern blot
findings, PCR studies on genomic DNA showed multiple band
amplification of MSRV-related endogenous sequences. Since
pol is the most conserved retroviral gene, the sequence
described here is the least suitable region to
discriminate between exogenous and endogenous sequences.
It is hoped that sequence information from other parts of
the genome may permit such a discrimination, even if only
over a small portion, as has recently been demonstrated for
the Jaagsiekte retrovirus (JSRV) of sheep (35). With such
sequence data, it would then become possible to identify
the MSRV-specific provirus in the genome of
virion-producing cell cultures.

MSRV could represent a virion-producing exogenous member
of an ERV9-like endogenous family, just as exogenous
strains exist in the well-studied mouse mammary tumour
virus (MMTV) and murine leukaemia virus (MuLV) retroviral
families of mice, and also in the JSRV retroviral family
of sheep (36). Alternatively, it is also conceivable that
the extracellular MSRV virions may be produced by a
replication-competent endogenous provirus. Whether MSRV is
exogenous or endogenous, conceptual similarities exist
with the category of retroviruses represented by MuLV,
MMTV and JSRV. Unlike defective endogenous elements, this
category of agents is known to produce infectious and
pathogenic virions, to cause neurological disease (37) and
solid tumours / leukaemias (36, 38), and to express
"endogenous superantigens" (39, 40). Furthermore, in MuLV
infections, the genetic endogenous retroviral background
of the mouse strain can determine susceptibility or
resistance to disease (39, 41). Indeed, such interactions
between an infectious retrovirus and its endogenous
counterpart may be relevant in the pathogenesis of MS,
since endogenous retroviral genotypes are not identical in
all individuals. A genetic control due to related
endogenous retroviral genotypes could therefore contribute
to the known hereditary susceptibility to MS (43), if MSRV
does indeed play an active role in this disease.

Furthermore, the data in Table 5 suggest that ERV9 elements
may be co-expressed, possibly via trans-activation in
infected cells, and give rise to heterologous RNA
packaging in MSRV virions. Such heterologous packaging is
known to occur in other retroviral systems (42).

- A role for the numerous common viruses previously
implicated in MS?

Among the numerous reports of viruses putatively
involved in the aetiopathogenesis of MS, a significant
proportion focus on two viral families, the
paramyxoviridae and the herpesviridae. Regarding the
paramyxoviridae, the key observation is of a frequently
increased antibody titer to measles virus in MS patients,
essentially directed, in CSF, against the measles fusion
protein (44). The existence of amino acid similarities
between conserved domains of the fusion proteins of
paramyxoviridae and the transmembrane protein of
retroviruses (45) may explain this observation if
antigenic cross-reactivity between these two proteins
occurred.

With regard to the herpesvirus family, the involvement of
Epstein-Barr Virus (EBV), Herpes Simplex Virus type 1
(HSV-1) and, most recently, Human Herpes Virus 6 (HHV-6)
has been proposed (31, 46, 47). From our previous studies
and from those of other groups, it appears that
herpesviruses may play an important role in MSRV
expression: we have shown that HSV-1 immediate-early ICP0
and ICP4 proteins can transactivate MSRV/LM7 in vitro (6),
and Haahr et al. have proposed an important
epidemiological role for EBV, as a co-factor in MS,
triggering retrovirus reactivation (31). The recent
description by Challoner et al. (47) showing significant
expression of HHV-6 proteins in MS plaques may also suggest
a similar role for HHV-6 in the brain.

EXAMPLE 19: MSRV GENOME DETECTION TECHNIQUE

Following 0.4 µm filtration to remove cellular
debris and RNase digestion to remove residual
non-encapsidated RNA, serum was processed to extract viral RNA
by means of adsorption to a silica matrix. Viral RNA was
subjected to DNase digestion, then a combined reverse
transcription-PCR (RT-PCR) reaction was performed using
primers PTpol-A (sense: 5'xxxx3', SEQ ID NO: 183) and
PTpol-F (antisense: 5'xxxx3', SEQ ID NO: 184). A second
round of amplification with nested primers PTpol-B (sense:
5'xxxx3', SEQ ID NO: 185) and PTpol-E (antisense: 5'xxxx3',
SEQ ID NO: 186) generated a 435 bp PCR product which was
identified by gel electrophoresis. The specificity of each
product was confirmed by dideoxy sequencing. Control
reactions without reverse transcriptase were performed to
ensure that the products were derived from viral RNA. In
addition, to exclude the possibility that the extracted
viral RNA might be contaminated with host cell derived
nucleic acids, aliquots were tested by nested PCR for the
presence of pyruvate dehydrogenase (PDH) DNA and RNA.
Samples which generated a signal in either the PDH or the
"no-RT" PCR assays were excluded from the analysis.

Sera from patients with clinically active MS and
controls were amplified by RT-PCR and sequenced.
Virion-associated MSRV-RNA was detected in the serum of 10 of 19
(53%) patients with MS but in only 3 of 44 controls
without MS (P=0.0001). The control group consisted of 8
patients (all MSRV-RNA negative) with rheumatological
disorders and 36 healthy adults. MSRV-RNA titres in both
MS patients and controls were apparently low because even
moderate dilution of sera (<10 fold) caused loss of
signal.

In MS patients, detection of MSRV-RNA was not
associated with age, sex, disease duration, or MS type;
however, a significant negative correlation with treatment
was observed. 26 serum samples were obtained from the 19
patients; 100% of the sera from untreated patients
contained detectable MSRV-RNA whereas it was detectable in
only 4 of 19 samples (21%) obtained during treatment with
corticosteroids and/or azathioprine (P=0.001).

The reason for the apparent loss of
virion-associated MSRV-RNA during immunosuppressive treatment is
unknown but the finding is in agreement with the previous
observations on the detection of MSRV in cerebrospinal
fluid.
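
With groups this small, the positivity rates reported above (10 of 19 MS patients, 3 of 44 controls) carry wide uncertainty. The sketch below is not part of the original analysis: it computes 95% Wilson score intervals for the two proportions, with the choice of interval and the function name being assumptions made for illustration.

```python
from math import sqrt


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = positives / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


ms_lo, ms_hi = wilson_interval(10, 19)     # MS patients, serum
ctrl_lo, ctrl_hi = wilson_interval(3, 44)  # controls without MS
print(f"MS:       {10 / 19:.0%} ({ms_lo:.1%} to {ms_hi:.1%})")
print(f"Controls: {3 / 44:.0%} ({ctrl_lo:.1%} to {ctrl_hi:.1%})")
```

Under these assumptions the two intervals do not overlap, which is compatible with the significant difference reported in the text.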

TABLE 7

DETECTION OF VIRION ASSOCIATED MSRV-RNA IN MS UNTREATED
PATIENTS & CONTROLS

                              Positive   Negative   Total   % Positive
Controls without MS a         3 b        41         44      7%
MS sera untreated at
time of sampling              7          0          7       100%

a The control group consisted of 8 patients with
miscellaneous non-MS disorders and 36 healthy adults.
b The detection of MSRV RNA in plasma of a few controls in
conditions which select virion-packaged RNA is consistent
with the knowledge that a virus associated with MS should
be present in a minor proportion of the apparently healthy
population. Indeed, such individuals can be either healthy
carriers or be in the pre-clinical (or sub-clinical) phase
of the disease, which can last for years.

METHOD:

- Modified SNAP RNA extraction with filtration and RNase
digestion

(All centrifugations are at room temperature)

Up to 500 microlitres of serum is filtered using
0.45 micron spin filters (Nanosep MF from Flowgen,
Catalogue No. U3-0126, Ref. 0DM45). The serum is spun for
5 min at 130,000 g (or for a further 10 min if necessary).

150 microlitres of filtered serum is incubated
with 10 units RNase One (Promega Catalogue No. M4261) for
30 min at 37°C.

The 150 microlitres was then extracted using the
SNAP RNA extraction kit (Invitrogen) as below:

- 10 micrograms of poly A RNA was added to the
450 microlitres of Binding Buffer to act as a carrier;
this was then added to the serum and mixed by inversion 6
times; 300 microlitres of propan-2-ol was then added and
mixed by inversion 10 times; 500 microlitres was
transferred to the SNAP column and spun at 1300 g for
1 min and the flow-through discarded; the remainder was
then added to the SNAP column and spun at 1300 g for 1 min
and the flow-through discarded; the column was then
washed with 600 microlitres of Super wash and the
flow-through discarded; the column was then washed with 600
microlitres of 1x RNA wash and the flow-through
discarded; this wash was repeated with a 2 min 1300 g
spin and the flow-through discarded; the bound nucleic
acid was then eluted by incubating with 135 microlitres of
RNase free water for 5 min and spun at 1300 g for 1 min.

- 15 microlitres of 10x DNase buffer and 3
microlitres (30 units) of DNase I, RNase free (Boehringer
Mannheim Cat. No. 776 785) was added and incubated for 30
min at 37°C; 450 microlitres of Binding Buffer was added
and mixed by inversion 6 times; 300 microlitres of
propan-2-ol was then added and mixed by inversion 10
times; 500 microlitres was transferred to the SNAP column
and spun at 1300 g for 1 min and the flow-through
discarded; the remainder was then added to the SNAP
column and spun at 1300 g for 1 min and the flow-through
discarded; the column was then washed with 600
microlitres of 1x RNA wash and the flow-through discarded;
this wash was repeated with a 2 min 1300 g spin and the
flow-through discarded; the bound nucleic acid was then
eluted by incubating with 105 microlitres of RNase free
water for 5 min and spun at 1300 g for 1 min.
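
The two column-binding passes above each split the mixture into a 500 microlitre load and a remainder. The following sketch simply adds up the stated volumes to show how large that remainder is; it assumes the full 135 microlitre eluate is recovered before DNase treatment, and the names `loading_plan` and `COLUMN_LOAD_UL` are illustrative only.

```python
COLUMN_LOAD_UL = 500  # volume applied to the SNAP column in the first spin


def loading_plan(sample_ul, binding_buffer_ul=450, propanol_ul=300):
    """Return (total mixture, first load, remainder) in microlitres."""
    total = sample_ul + binding_buffer_ul + propanol_ul
    first_load = min(total, COLUMN_LOAD_UL)
    return total, first_load, total - first_load


# Pass 1: 150 ul of RNase-treated serum.
first_pass = loading_plan(150)
# Pass 2: 135 ul eluate + 15 ul 10x DNase buffer + 3 ul DNase I
# (assumes complete recovery of the eluate).
second_pass = loading_plan(135 + 15 + 3)

print("pass 1 (total, first load, remainder):", first_pass)
print("pass 2 (total, first load, remainder):", second_pass)
```

In both passes the mixture is roughly 900 microlitres, so two successive loads on the same column are needed, as the protocol describes.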

10 

- Titan RT-PCR

RT-PCR was performed using the Titan one tube
RT-PCR system (Boehringer Mannheim Cat. No. 1 855 476). 25
microlitres of RNA was used in the combined RT-PCR
reaction. The total reaction volume was 50 microlitres.
Promega rRNasin (10 units) was the RNase inhibitor used.
170 ng of primers SEQ ID NO: 183 and SEQ ID NO: 184,
respectively, were used. A single master mix was prepared
and the sample RNA added last. This was performed at room
temperature, not on ice.

The RT step consisted of two sequential 30 min
incubations at 50°C and then 60°C. This was immediately
followed by the PCR, which had the following steps:

* Initial denaturation of template at 94°C for 2 min,

* 40 cycles of 94°C for 30 seconds; 60°C for 30 seconds;
68°C for 45 seconds,

* 1 cycle of 68°C for 7 min.
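
For planning purposes, the first-round programme above can be written down as data and its programmed hold time summed. The sketch below does only that; the tuple layout and function name are illustrative assumptions, and ramp times between temperatures (which depend on the thermocycler) are not included.

```python
# (step, temperature in deg C, seconds per repeat, number of repeats)
FIRST_ROUND = [
    ("reverse transcription", 50, 30 * 60, 1),
    ("reverse transcription", 60, 30 * 60, 1),
    ("initial denaturation", 94, 2 * 60, 1),
    ("denaturation", 94, 30, 40),
    ("annealing", 60, 30, 40),
    ("extension", 68, 45, 40),
    ("final extension", 68, 7 * 60, 1),
]


def total_hold_minutes(program):
    """Sum of programmed hold times, excluding temperature ramps."""
    return sum(seconds * repeats for _, _, seconds, repeats in program) / 60


hold_minutes = total_hold_minutes(FIRST_ROUND)
print(f"programmed hold time: {hold_minutes:.0f} min")
```

The 40 amplification cycles account for about half of the programmed time, the two reverse transcription incubations for most of the rest.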

The second round PCR was performed using the
Expand long template PCR system (Boehringer Mannheim Cat.
No. 1681 842). 0.5 microlitres of the RT-PCR mix was added
to 25 microlitres of the round 2 PCR mix. Buffer No. 3 and
50 ng of primers B and E were used. The PCR had the
following steps:

* 5 cycles of 94°C for 30 seconds, 60°C for 30 seconds,
68°C for 45 seconds,

* 1 cycle of 68°C for 7 min.

The PCR products were then run on a 2% agarose
gel.

The no-RT controls were performed using the "Expand"
PCR system for both rounds. The first round was 40 cycles
and the second round 20 cycles.

As a positive control, a DNA dilution series was
used in both the RT-PCR and the "no-RT" PCR. For a result
to be valid, the RT-PCR and "no-RT" PCRs had to have
detected DNA equivalent to between 1 and 0.1 cells.

The analysis of PCR products of an approximately
435 bp fragment in the pol region is shown in Table 8.

TABLE 8

ANALYSIS OF PCR PRODUCTS WITH ORF *

Exp    Disease   Clone   ORF   Fragment (bp)   AA-RT Motif Site
46-7   MS        1       +     429             YGDD
                 5       +     429             YGDD
                 8       +     429             YGDD
68-1   MS        41      +     438             YMDD
                 42      +     438             YMDD
                 43      +     438             YMDD

* Defective RNA can also be present in circulating
virions, since the fidelity of the MSRV reverse
transcriptase appears to be low and since recombination
events with related endogenous elements can occur. It is
therefore clear that the intra- and inter-patient
variability can be greater than that illustrated in this
example, because of these encapsidated defective MSRV RNA
copies.
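
Table 8 scores each clone for an open reading frame and for the Y-X-D-D motif of the RT active site. As a hedged illustration of that kind of check, the sketch below translates a 30-nucleotide excerpt of SEQ ID NO: 1 (nucleotides 361-390 as listed in the sequence listing) with the standard genetic code and searches the peptide for the motif. The helper names are assumptions, and this is not the procedure used by the authors.

```python
import re

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate in frame 1; codons with ambiguous bases become 'X'."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def has_open_frame(dna):
    """True when the frame-1 translation contains no stop codon."""
    return "*" not in translate(dna)


def rt_motifs(peptide):
    """All Y-X-D-D occurrences (e.g. YMDD, YGDD) in a peptide."""
    return re.findall(r"Y.DD", peptide)


excerpt = "CTTGTCCTTC AGTACATGGA TGATTTACTT"  # SEQ ID NO: 1, nt 361-390
peptide = translate(excerpt)
print(peptide, rt_motifs(peptide), has_open_frame(excerpt))
```

Both fragment lengths in Table 8 (429 and 438 bp) are multiples of three, so an uninterrupted frame across either fragment is arithmetically possible.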

Table 9, whose data were determined from the
alignments of Figures 49 to 53, shows variability:
- between the clones obtained from the same patient plasma
sample in the same PCR amplification experiment; this
means that the patient possesses a virion population which
comprises different MSRV variants at a given time,
- between the sequenced variant populations from different
patients; this means that the variants differ from one
patient to another.

TABLE 9

Degree of identity (percentage) between nucleotide
sequences and between peptide sequences,
by direct comparison of said sequences (see Figures 49-53)

Patient 68-1

  Nucleotide sequences
    between SEQ ID NO: 169 and MSRV-pol (SEQ ID NO: 1)   90.4% b   92.3% a
    SEQ ID NOs: 170, 171, 172 between them               98.6% b   98.7% a

  Peptide sequences
    between SEQ ID NOs: 173, 174, 175 and SEQ ID NO:     81%
    SEQ ID NOs: 173, 174, 175 between them               97%

Patient 46-7

  Nucleotide sequences
    between SEQ ID NO: 176 and MSRV-pol (SEQ ID NO: 1)   82.5% a   84% b
    SEQ ID NOs: 177, 178, 179 between them               94.5% a   95.1% b

  Peptide sequences
    between SEQ ID NOs: 180, 181, 182 and SEQ ID NO:     73.5%
    SEQ ID NOs: 180, 181, 182 between them               89%

a) this percentage is determined on the basis of sequences
excluding the primers

b) this percentage is determined on the basis of sequences
including the primers.

From Figures 53A and 53B, the variability between the
sequences of the tested patients can be determined:
- between SEQ ID NO: 169 and SEQ ID NO: 176: 16.5% a and
14.8% b
- between the peptide sequences obtained from
SEQ ID NO: 169 and SEQ ID NO: 176: 20%.
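
The identity and variability figures above come from direct comparison of aligned sequences, computed either with or without the primer regions. A minimal sketch of that calculation follows; the two short sequences are invented toy data, and `skip_5`/`skip_3` are hypothetical parameters standing in for the primer lengths at each end.

```python
def percent_identity(seq_a, seq_b, skip_5=0, skip_3=0):
    """Percent identical positions between two aligned, equal-length sequences.

    skip_5 / skip_3 trim that many aligned columns from each end, which
    mimics excluding the primer regions from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    end = len(seq_a) - skip_3
    # Ignore columns where both sequences carry a gap.
    pairs = [(x, y) for x, y in zip(seq_a[skip_5:end], seq_b[skip_5:end])
             if x != "-" or y != "-"]
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment (not taken from the figures): one mismatch in ten columns.
a = "ACGTACGTAC"
b = "ACGTTCGTAC"
with_primers = percent_identity(a, b)
without_primers = percent_identity(a, b, skip_5=2, skip_3=2)
print(f"{with_primers:.1f}% including ends, {without_primers:.1f}% excluding ends")
print(f"variability excluding ends: {100 - without_primers:.1f}%")
```

Because the ends copied from the primers are identical by construction, trimming them lowers the identity and raises the variability, which is the direction of the difference between the a) and b) values quoted for SEQ ID NO: 169 versus SEQ ID NO: 176.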

Four microorganisms are mentioned in the
specification page 3 lines 15-26 and they are identified
below. They have all been deposited with the ECACC*, in
accordance with the provisions of the Budapest Treaty.

- LM7PC deposited on 22nd July 1992 under No. 92072201,

- PLI-2 deposited on 8th January 1993 under No. 93010817,

- POL-2 deposited on 22nd July 1992 under No. V92072202,
and

- MS7PG deposited on 8th January 1993 under No. V93010816.

* ECACC: European Collection of Animal Cell Cultures
Vaccine Research and Production Laboratory
Public Health Laboratory Service
Centre of Applied Microbiology and Research
Porton Down
Salisbury, Wiltshire SP4 0JG
United Kingdom
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: BIO MERIEUX

(ii) TITLE OF THE INVENTION: VIRAL MATERIAL AND NUCLEOTIDE
FRAGMENTS ASSOCIATED WITH MULTIPLE SCLEROSIS, FOR DIAGNOSTIC,
PROPHYLACTIC AND THERAPEUTIC PURPOSES

(iii) NUMBER OF SEQUENCES: 160

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: CABINET GERMAIN & MAUREAU
(B) STREET: 12 rue Boileau
(C) CITY: LYON
(D) COUNTRY: FRANCE
(E) ZIP: 69006

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Dominique GUERRE
(B) REGISTRATION NUMBER:
(C) REFERENCE/DOCKET NUMBER: MD/B05B2679

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 4 72 69 84 30
(B) TELEFAX: 4 72 69 84 31

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1158 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 300
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA GACTTGAGTC AATTCTCATA CCTGGACACT 360
CTTGTCCTTC AGTACATGGA TGATTTACTT TTAGTCGCCC GTTCAGAAAC CTTGTGCCAT 420
CAAGCCACCC AAGAACTCTT AACTTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 480
AAGGCTCGGC TCTGCTCACA GGAGATTAGA TACTNAGGGC TAAAATTATC CAAAGGCACC 540
AGGGCCCTCA GTGAGGAACG TATCCAGCCT ATACTGGCTT ATCCTCATCC CAAAACCCTA 600
AAGCAACTAA GAGGGTTCCT TGGCATAACA GGTTTCTGCC GAAAACAGAT TCCCAGGTAC 660
ASCCCAATAG CCAGACCATT ATATACACTA ATTANGGAAA CTCAGAAAGC CAATACCTAT 720
TTAGTAAGAT GGACACCTAC AGAAGTGGCT TTCCAGGCCC TAAAGAAGGC CCTAACCCAA 780
GCCCCAGTGT TCAGCTTGCC AACAGGGCAA GATTTTTCTT TATATGCCAC AGAAAAAACA 840
GGAATAGCTC TAGGAGTCCT TACGCAGGTC TCAGGGATGA GCTTGCAACC CGTGGTATAC 900
CTGAGTAAGG AAATTGATGT AGTGGCAAAG GGTTGGCCTC ATNGTTTATG GGTAATGGNG 960
GCAGTAGCAG TCTNAGTATC TGAAGCAGTT AAAATAATAC AGGGAAGAGA TCTTNCTGTG 1020
TGGACATCTC ATGATGTGAA CGGCATACTC ACTGCTAAAG GAGACTTGTG GTTGTCAGAC 1080
AACCATTTAC TTAANTATCA GGCTCTATTA CTTGAAGAGC CAGTGCTGNG ACTGCGCACT 1140
TGTGCAACTC TTAAACCC 1158

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 297 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAAGGGA 297

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTTTAGGGAT ANCCCTCATC TCTTTGGTCA GGTACTGGCC CAAGATCTAG GCCACTTCTC 60
AGGTCCAGSN ACTCTGTYCC TTCAG 85

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTTCAGGGAT AGCCCCCATC TATTTGGCCA GGCACTAGCT CAATACTTGA GCCAGTTCTC 60
ATACCTGGAC AYTCTYGTCC TTCGGT 86

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTTCARRGAT AGCCCCCATC TATTTGGCCW RGYATTAGCC CAAGACTTGA GYCAATTCTC 60
ATACCTGGAC ACTCTTGTCC TTYRG 85
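
Several entries in this listing use nucleotide ambiguity codes (R, Y, W, S, N and so on). As a reading aid only, the sketch below expands the standard IUPAC codes into a regular expression and tests the last seven positions of SEQ ID NO: 5 (CCTTYRG) against the corresponding ends of SEQ ID NO: 3 and SEQ ID NO: 6; the function name is an assumption and nothing here comes from the specification itself.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(seq):
    """Compile a degenerate sequence into a regex over A, C, G and T."""
    return re.compile("".join(IUPAC[base] for base in seq.upper().replace(" ", "")))


pattern = iupac_to_regex("CCTTYRG")        # 3' end of SEQ ID NO: 5
print(bool(pattern.fullmatch("CCTTCAG")))  # 3' end of SEQ ID NO: 3
print(bool(pattern.fullmatch("CCTTTGG")))  # 3' end of SEQ ID NO: 6
print(bool(pattern.fullmatch("CCTTAAG")))  # hypothetical non-matching variant
```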

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTTCAGGGAT AGCTCCCATC TATTTGGCCT GGCATTAACC CGAGACTTAA GCCAGTTCTY 60
ATACGTGGAC ACTCTTGTCC TTTGG 85

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTGTTGCCAC AGGGGTTTAR RGATANCYCY CATCTMTTTG GYCWRGYAYT RRCYCRAKAY 60
YTRRGYCAVT TCTYAKRYSY RGSNAYTCTB KYCCTTYRGT ACATGGATGA C 111



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 645 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCAGGGATAG CCCCCATCTA TTTGGCCAGG CATTAGCCCA AGACTTGAGT CAATTCTCAT 60
ACCTGGACAC TCTTGTCCTT CAGTACATGG ATGATTTACT TTTAGTCGCC CGTTCAGAAA 120
CCTTGTGCCA TCAAGCCACC CAAGAACTCT TAACTTTCCT CACTACCTGT GGCTACAAGG 180
TTTCCAAACC AAAGGCTCGG CTCTGCTCAC AGGAGATTAG ATACTNAGGG CTAAAATTAT 240
CCAAAGGCAC CAGGGCCCTC AGTGAGGAAC GTATCCAGCC TATACTGGCT TATCCTCATC 300
CCAAAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGGTTTCTGC CGAAAACAGA 360
TTCCCAGGTA CASCCCAATA GCCAGACCAT TATATACACT AATTANGGAA ACTCAGAAAG 420
CCAATACCTA TTTAGTAAGA TGGACACCTA CAGAAGTGGC TTTCCAGGCC CTAAAGAAGG 480
CCCTAACCCA AGCCCCAGTG TTCAGCTTGC CAACAGGGCA AGATTTTTCT TTATATGCCA 540
CAGAAAAAAC AGGAATAGCT CTAGGAGTCC TTACGCAGGT CTCAGGGATG AGCTTGCAAC 600
CCGTGGTATA CCTGAGTAAG GAAATTGATG TAGTGGCAAA GGGTT 645



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 741 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CAAGCCACCC AAGAACTCTT AAATTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 60
AAGGCTCAGC TCTGCTCACA GGAGATTAGA TACTTAGGGT TAAAATTATC CAAAGGCACC 120
AGGGGCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT ATCCTCATCC CAAAACCCTA 180
AAGCAACTAA GAGGGTTCCT TAGCATGATC AGGTTTCTGC CGAAAACAAG ATTCCCAGGT 240
ACAACCAAAA TAGCCAGACC ATTATATACA CTAATTAAGG AAACTCAGAA AGCCAATACC 300
TATTTAGTAA GATGGACACC TAAACAGAAG GCTTTCCAGG CCCTAAAGAA GGCCCTAACC 360
CAAGCCCCAG TGTTCAGCTT GCCAACAGGG CAAGATTTTT CTTTATATGG CACAGAAAAA 420
ACAGGAATCG CTCTAGGAGT CCTTACACAG GTCCGAGGGA TGAGCTTGCA ACCCGTGGCA 480
TACCTGAATA AGGAAATTGA TGTAGTGGCA AAGGGTTGGC CTCATNGTTT ATGGGTAATG 540
GNGGCAGTAG CAGTCTNAGT ATCTGAAGCA GTTAAAATAA TACAGGGAAG AGATCTTNCT 600
GTGTGGACAT CTCATGATGT GAACGGCATA CTCACTGCTA AAGGAGACTT GTGGTTGTCA 660
GACAACCATT TACTTAANTA TCAGGCTCTA TTACTTGAAG AGCCAGTGCT GNGACTGCGC 720
ACTTGTGCAA CTCTTAAACC C 741



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TGGAAAGTGT TGCCACAGGG CGCTGAAGCC TATCGCGTGC AGTTGCCGGA TGCCGCCTAT 60
AGCCTCTACA TGGATGACAT CCTGCTGGCC TCC 93

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 96 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TTGGATCCAG TGYTGCCACA GGGCGCTGAA GCCTATCGCG TGCAGTTGCC GGATGCCGCC 60
TATAGCCTCT ACGTGGATGA CCTSCTGAAG CTTGAG 96


(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 748 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGCAAGCTTC ACCGCTTGCT GGATGTAGGC CTCAGTACCG GNGTGCCCCG CGCGCTGTAG 60
TTCGATGTAG AAAGCGCCCG GAAACACGCG GGACCAATGC GTCGCCAGCT TGCGCGCCAG 120
CGCCTCGTTG CCATTGGCCA GCGCCACGCC GATATCACCC GCCATGGCGC CGGAGAGCGC 180
CAGCAGACCG GCGGCCAGCG GCGCATTCTC AACGCCGGGC TCGTCGAACC ATTCGGGGGC 240
GATTTCCGCA CGACCGCGAT GCTGGTTGGA GAGCCAGGCC CTGGCCAGCA ACTGGCACAG 300
GTTCAGGTAA CCCTGCTTGT CCCGCACCAA CAGCAGCAGG CGGGTCGGCT TGTCGCGCTC 360
GTCGTGATTG GTGATCCACA CGTCAGCCCC GACGATGGGC TTCACGCCCT TGCCACGCGC 420
TTCCTTGTAG ANGCGCACCA GCCCGAAGGC ATTGGCGAGA TCGGTCAGCG CCAAGGCGCC 480
CATGCCATCT TTGGCGGCAG CCTTGACGGC ATCGTCGAGA CGGACATTGC CATCGACGAC 540
GGAATATTCG GAGTGGAGAC GGAGGTGGAC GAAGCGCGGC GAATTCATCC GCGTATTGTA 600
ACGGGTGACA CCTTCCGCAA AGCATTCCGG ACGTGCCCGA TTGACCCGGA GCAACCCCGC 660
ACGGCTGCGC GGGCAGTTAT AATTTCGGCT TACGAATCAA CGGGTTACCC CAGGGCGCTG 720
AAGCCTATCG CGTGCAGTTG CCGGATGC 748



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GTAGTTCGAT GTAGAAAGCG 20

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AACCCTTTGC CACTACATCA ATTT 24



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:
(B) LOCATION: 5, 7, 10, 13
(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GGTCGTGCCG CAGGG 15
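
The feature note above (G at the listed locations stands for inosine) can be applied mechanically. The following is a minimal sketch, assuming the locations are 1-based and writing inosine as the letter I, a common shorthand that this listing does not itself use; the function name `mark_inosine` is hypothetical.

```python
def mark_inosine(seq: str, positions: list[int], placeholder: str = "G") -> str:
    """Return seq with inosine ('I') written at the given 1-based positions.

    Raises ValueError if a listed position does not hold the placeholder base,
    which would point to a transcription error in the sequence or the note.
    """
    bases = list(seq.replace(" ", "").upper())
    for pos in positions:
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the sequence")
        if bases[pos - 1] != placeholder:
            raise ValueError(f"position {pos} holds {bases[pos - 1]}, not {placeholder}")
        bases[pos - 1] = "I"
    return "".join(bases)


# SEQ ID NO: 20 as printed, with the locations from its (ix) FEATURES entry.
seq20_with_inosine = mark_inosine("GGTCGTGCCG CAGGG", [5, 7, 10, 13])
print(seq20_with_inosine)
```

The check that each listed location really holds a G is the useful part: it fails loudly if the printed sequence and the feature note disagree.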

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTAGGGATAG CCCTCATCTC T 21

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AACCCTTTGC CACTACATCA ATTT 24

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GCGTAAGGAC TCCTAGAGCT ATT 23

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TCATCCATGT ACCGAAGG 18

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ATGGGGTTCC CAAGTTCCCT 20

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GCCGATATCA CCCGCCATGG 20

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CGCGATGCTG GTTGGAGAGC 20

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TCTCCACTCC GAATATTCCG 20

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GATCTAGGCC ACTTCTCAGG TCCAGS 26

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:
(B) LOCATION: 6, 12, 19
(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

CATCTGTTTG GGCAGGCAGT AGC 23

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CTTGAGCCAG TTCTCATACC TGGA 24

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

AGTGYTRCCM CARGGCGCTG AA 22
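
SEQ ID NO: 34 contains the letters Y, R and M, which read as IUPAC ambiguity codes, so the entry describes a pool of primers rather than a single oligonucleotide. The sketch below, which assumes the standard IUPAC code table and uses a hypothetical helper name `expand_degenerate`, enumerates the concrete sequences the degenerate primer stands for.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """List every unambiguous sequence covered by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in seq]  # KeyError on a non-IUPAC letter
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 34 as printed: four two-fold degenerate positions (Y, R, M, R).
variants = expand_degenerate("AGTGYTRCCM CARGGCGCTG AA")
print(len(variants))
```

The pool size grows as the product of the choices at each ambiguous position, which is why only a few degenerate positions are normally tolerated in one primer.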

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

GMGGCCAGCA GSAKGTCATC CA 22

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGATGCCGCC TATAGCCTCT AC 22

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

AAGCCTATCG CGTGCAGTTG CC 22

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TAAAGATCTA GAATTCGGCT ATAGGCGGCA TCCGGCAAGT 40

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 amino acids
(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe
1               5                   10                  15

Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu Thr Trp Thr Val
            20                  25                  30

Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Gln Ala Leu
        35                  40                  45

Ala Gln
    50

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 60
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 120
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA 150
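
SEQ ID NO: 40 is 150 bases long and SEQ ID NO: 39 is 50 residues long, which suggests that the cDNA encodes the peptide. The listing does not say so explicitly, so the following is only a sketch of how that can be checked, assuming the first reading frame and the standard genetic code. The peptide is typed in with standard three-letter codes, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(dna: str) -> list[str]:
    """Translate frame 1 of dna into three-letter residue names."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


SEQ_ID_40 = (
    "GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT "
    "CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC "
    "CCCCATCTAT TTGGCCAGGC ATTAGCCCAA"
)
SEQ_ID_39 = (
    "Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe "
    "Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu Thr Trp Thr Val "
    "Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Gln Ala Leu "
    "Ala Gln"
).split()

peptide = translate(SEQ_ID_40)
print(peptide == SEQ_ID_39)
```

A mismatch at a single position would locate a transcription error in one of the two entries, which makes this a cheap consistency check on a scanned listing.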

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids
(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Glu Ala
1               5                   10                  15

Leu
17

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Leu Phe Ala Phe Glu Asp Pro Leu

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Phe Ala Phe Glu Asp Pro Leu Asn
1               5           8

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GTGCTGATTG GTGTATTTAC AATCC 25

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1859 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:





GTGCTGATTG 


GTGTATTTAP 

VJ X VJ x n X X X n\< 


AATPCTTTAT 
An iwvti x x n x 


PTAATPPGAA 


ATGPPPATGT 
f» xvjv^v^v^nxvjx 


TGPAATATGG 
luwnn x t\ x vj vj 


fin 


10 

X. V/ 


AAAGAAAGGG 


AGTTCCTAAC 
*»vj x x v— ^ x nnv> 


CTCTGGGGGA 


ACCCCCATTA 
nv«v«v<VifV<ni x/* 


AATACCACAA 


G T AAA TC ATG 
vj x nnn x \*n x vj 


X £ Vj 




GAGTTATTGC 

Mrtui inl xvjv*> 


ACACAGTGCA 


AAAACTCAAG 
iwinv* x v^rvrivj 


GAGGTGGAAG 
unuu x vjvjrvrtvj 




PCAAAGPP.AT 
v^v^n/muv/vn X 


XOv 




CAGAAAAGGG 


AAGAGGGGAG 

<^ravm wvj\*nvj 


AAGAGCAGCA 


TAAGTGGCTA 
x cm\j x vjvj w x /» 


CAGAGGCAAG 


GAAAGACTAG 
un/vnunv< x nv> 


£*s vJ 




CAGAAAGGAA 


AGAGAGAAAG 


AGACAGAAAG 


TCAGAGAGAG 


AGAGAGGAAG 

rtvjrtvjfivjvjrvrivj 


AGACAGAGCA 
rv un v_r\vji-ivj v^n 


300 




CAAAGAGGGA 


GTPAGAGAGA 


GAGAGAGAPA 


GAGAGTPAGA 
vjrivjrivj x vnun 


GAGAAGGAAA 


GAGAGAGAGG 
vj ri vj r\ vj nunuu 




15 


A AG AG AP AAA 


GAATGAATPA 


AAPAGAGAGA 


PAGAAAGTPA 
v^ r\ vj /xrvfl vj x v»rt 


GAG AG AG AG A 


GAGAGAGGAA 


Haw 




GAGAPAftAGA 


A A A AflAGGGA 
rt/Vft/i unVj vjvj rt 


RTPAflAAAAA 


n ag Arc Apr* a a 

VjMvjf\Vj/^V^ V^fVM 


AG A AGAAGTP 
rVVjrV-ttvj/vrtvj X V*. 


PAAAHAflAAA 


HOW 




GAAAGAGAGA 


TGGAAGTAGT 
X vjvj rxr\vj X rlvj X 


AAAGGAAAAA 
AnnviunnnAn 


PAGTGTAPPP 
v^/» vj x vj x riv^v*Va> 


TATTPPTTTA 


AAAGPPGGGG 
nnrtv v»»v^vj\jvjvj 


RAO 




TAAATTTAAA 


APPTATAATT 


GATAACTGAA 

vin x nnu x unn 


GGTPTTPTPT 1 

VJVJ i. Ul X V* X Vw X 


GTAACCCTGT 

w x nnv«vv< x vj x 


AACACTCCAA 


600 




TACCACCTTG TTGTCAAGTG TAAACAAGGG CGTAGCCCAA AAGCACTGAG GCCACTAACA 660
ACCCATAGCC TTCCTATCAA AATTCCTTAA CCCAGCAGGT TTCCTAACAG GGGATCTAAA 720
TCTTAATTAA TTACCATACA ATGGTCCAAC CAGACTTAGG AGGAATTCCC TTCAGGACGG 780
GAAGATAGAT GCTTCCTCCC AGGCGATTAA GGGAGAAAGA CACAATGGGT ATTCAGTAAG 840
TGCCAAGGGG AACACTTGTA GAAGCAAAGT TAGGAAAATT GCCAAATAAT TGGTTTGCTC 900
AAGAGTTGTT TGCACTCAGC CAAACCTTGA AGTACTTGCA GAATCAGAAA GGAGCCATCT 960
ATACCAATTC TAAGTTAATA TGGACTGAAG GAGGTTTTAT TAATACCAAA GAGAAATTAA 1020
AATCCCAAAC TTATAAGGTT TTCAACCAAA GTAAAGTTTG CTAAAAGTTA ACAGCGTAAC 1080
ATGTATTATC CTACTACCAC ACACTCTCAA AGGATTTCTC AGACAGTTTG CAAGAAATAA 1140
TGATATCTAT CCTTACTCTA CAATCCCAAA TAGACTCTTT GGCAGCAGTG ACTCTCCAAA 1200
ACCGTCAAGG CCTAGACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC TTCTTAAGGG 1260
AAGAGTGTTG TCTTTACACT AACCAGTCAG GGATAGTATG AGATGCTGCC CGGCATTTAC 1320
AGAAAAAGGC TTCTGAAATC AGACAACGCC TTTCAAATTC CTATACCAAC CTCTGGAGTT 1380
GGGCAACATG GTTTCTTCCC TTTCTATGTC CCATGGCTGC CATCTTGCTA TTACTCGCCT 1440
TTGGGCCCTG TATTTTTAAC CTCCTTGTCA AATTTGTTTC TTCTAGGATC GAGGCCATCA 1500
AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGCTC AACTATCAAC TTCTACTGAG 1560
GACCCCTAGA CCAACCCCCT GGCCCTTTCA CTGGCCTAAA GAGTTCCCCT CTGGAGGACA 1620
CTACCACTGC AGGGCCCCAT CTTTGCCCCT ATCCAGAAGG AAGTAGCTAG AGCAGTCATT 1680
GCCCAATTCC CAAGAGCAGC TGGGGTGTCC CGTTTAGAGT GGGGATTGAG AGGTGAAGCC 1740
AGCTGGACTT CTGGGTCGGG TGGGGACTTG GAGAACTTTT GTGTCTAGCT AAAGGATTGT 1800
AAATGCAACA ATCAGTGCTC TGTGTCTAGC TAAAGGATTG TAAATACACC AATCAGCAC 1859



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TGATGTGAAC GGCATACTCA CTG 23

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CCCAGAGGTT AGGAACTCCC TTTC 24

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GCTAAAGGAG ACTTGTGGTT GTCAG 25

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CAACATGGGC ATTTCGGATT AG 22

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 400 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGCTGCTAAA GGAGACTTGT GGTTGTCAGA CAATCGCCTA CTTAGGTACC AGGCCTTATT 60
ACTTGAGGGA CTGGTGCTTC AGATGCGCAC TTGTGCAGCT CTTAACCCAA ACTTATGCTG 120
CCCAGAAGGA TCTTTTAGAG GTCCCCTTAG CCAACCCTGA CCTCAACCTA TATATATACT 180
GATGGAAGTT CGTTTGTAGA AAAGGGATTA CAAAGGGNAG GATATNCCAT AGGTTAGTGA 240
TAAAGCAGTA CTTGAAAGTA AGCCTCTTCC CCCCAGGGAC CAGCGCCCCC GTTAGCAGAA 300
CTAGTGGCAC TGACCCCGAG CCTTAGAACT TGGAAAGGGA GGAGGATAAA TGTGTATACA 360
GATAGCAAGT ATGCTTATCT AATCCGAAAT GCCCATGTTG 400



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2389 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

TCAGGGATAG CCCCCATCTA TTTGGTCAGG CACTGGCCCA AGATCTAGGG ACATGCCACT 60
TTTAAGAGCC ATTTCTCAAG TCCAGGTACT CTGGTCCTTC GGTATGTGGA TGATTTACTT 120
TTGGCTACCA GTTCAGTAGC CTCATGCCAG CAGGCTACTC TAGATCTCTT GAACTTTCTA 180
GCTAATCAAG GGTACAAGGC ATCTAGGTTG AAGGCCCAGC TTTGCCTACA GCAGGTCAAA 240
TATCTAGGCC TAATCTTAGC CAGAGGGACC AGGGCACTCA GCAAGGAACA AATACAGCCT 300
ATACTGGCTT ATCCTCACCC TAAGACATTA AAACAGTTGC GGGGGTTCCT TGGAATCACT 360
GGCTTTTTGG TGACTATGGA TTCCCAGATA CAGCAAGATT GGCAGGCCCC TCTATACTGT 420
AATCAAGGAG ACTCACGAGG GCAAGTACTC ATCTAGTAGA ATGGGAACTA GGGACAGAAA 480
CAGCCTTCAA AACCTTAAAG CAGGCCCTAG TACAATCTCC AGCTTTAAGC CTTCCCACAG 540
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oHL-rtnAftu X 1 


UlL>i X irtlrtL 


ATP* A PAH AH A 


PPPPAPAPAT 


APPTPTTPPT 


GTCCTTATTC 


Ann 


nbftv, 1 Ln 1 wljr 


p A PT A P PT P A 


paappacTnn 


PAPAPPTA AC 


TAAPP A A ATT 
lAAuViAAAl 1 


p a tptiptsp 
v» A X v» 1 A\» 1 Av» 


can 


paa aappptp 


GPPTP1PTP.T 


TT A TCRfiTA P 

X X n X \»OVJ X nv» 


\- X Vj X VJU X liw x 


PPPTPTPTTA 


PTPTPAP A AP 
wlul UAuAAb 




PTATPA AA AT 


A STSPA APP A 


A APP ATPTP A 
nnuun x \* x v*n 


PTPTPTPPAP 


TAPTPATP AT 
X nL X CH X X 


p t a a tp p p a t 
blAAlbULAl 


/ oU 


aptapptppp 


A A A AC A ap.TT 


TATGGGTATP 


APAPAAPPAP 


PTPPTTAPAT 


APPAPPPAPT 
AuUAwuuAU X 


o**u 


1 V^L* X Vjvnu 


P A TTP.P.P.PTT 


PAAPTPPPTT 


TTTTfiTP-P-PP 

XXX XVf luuVfU 


TP A APPPTPP 


pa p^p^h^p^pp^pt 

tnL X X X X X 




CCAGAfiCATC 


CACACrPCPT 


TGAGCATGCT 


TGCCAAPAGG 


TTGTAPPPPA 
x xvxnvui~^~n 


PAATTATTPP 
unn X XnX X l»o 


960 


ACCCPAPATP. 


ATPTPTTACA 


GTACCCTTAG 


CTAATPPTGA 


PPTTAAPPTA 
v*\* x x nnv^v^ x n 


TATAPPAATP 
xnx n^vnn x w 


1020 


unnu x x v^n x x 


TRTdRAAAAP 


PGP AT ATP AA 
uviun x n x unn 


GGGCAGGTTA 
vuu^nuvj x x n 


TCTPATACTT 
iu i ^n x nvj x x 


APTPATPTAA 
nu X vrn iui nn 


i nan 


TPATAPTTfiP 


AAPTAAPPPT 


PTTACCCCAG 


GCGPPAfiPAP 


TPAPTTAPPA 
x v»nu x x nv7«-»n 


PA APTAPTPA 


X x**u 


PHPTT1PPTT 
vnUl X nUl> J. x 


A APPTTAPA A 


PTP.RRAAARP. 


G AAA A AH A AT 


A A ATATPTAT 
nnn inivjinl 


APAPATAPTA 
nUnun X n w X n 


i onn 

i/uu 


APTATPPTTA 
nv lniu^l In 


TPTAATPPTA 


PATPPPPATP 


PTPPA ATATP 
v- X u^nn Inly 


PAAPPAAAPP 


P APTTPPTA A 


1 Ofin 


VvCCC X VjljljOO 


aapppppaTT 


A A AT APP AP A 
nnn X CnCn 


APPV A A ATP A 
nuu I nnn X \~t\ 


TPP UPTTfiTT 
1 uunu 1 Xnl X 


pp a nn r % TLr"T t n 


1 *aon 


pftRiaikPTpa 


apnapcTP.pp 


APTPTTAPAP 


TPPPP A APPV 


ATP A AAAAPP 
n X onnnnnLrL» 


ppaappapap 


1 'aan 

lJOU 


pppapaapap 


PaftPlTlfifiT 
LhbLrt X nnu 1 


p.p.TTRp.papa 


RPp AP.TP AAA 
uuk/nu X v*nnn 


P APPAPPAP A 


papa appapa 

bAbAAbbAbA 


i a An 


PlkPlPHRPrT 
OnwnCnAlA* 1 


P a a PP fcPRPS 


a pp a a hrMir 1 
AbOAAAuAAb 


appappapap 
AbviAbuAuA^ 


apapappaap 
AwAvjnuuAAb 


apapapapap 
AbAbAbAbAb 


ibuu 


apapTTaPTP 
AtAO 1 I At» 1 C 


^r'^n^nis.n 
LAAvjAlrAbAb 


ALAbAbAbAu 


paapapapap 


ALAbAnA(j x C 


paapapapaa 
L.AAUAbiAt>AA 


lDbu 


pp. a iirB^ar 
vfennnonljnV? 


uAAuAuAu^A 


Auunu X LvPIA 


papapapaaa 


PAPATAP A AP 
V»nV»n X nunny 


Tap»ra a ap a a 


i con 


A a a apaTTPT 


ACCCTATTCC 


1 1 lnni\nljCC 


PPPPTiTfiTT 


X rVHnnoo Xnl 


MTTPJTl Rip 
nn 1 ibnlAAl 


i con 


TP* fiPTTPTTP 


L.n^Lt 1 CC 1 C 


CnVjVjV3V?nX lu 


nTnnn app a a 


a nnnTn'n a pp 


pithtp tp a a 
un XAlbl bnn 


i *7 An 
X / *u 


flUTTPTPPPT 
nnX lblViuui 


PPTPPPT1TP 


TPTP a 7\ r P r P a P 
X C X Cnn X X no 


pap pp nafap 


PPPPTTP r T T 'f ,r i* 
o X 1 o 1 X X 


TTAPTPTP A A 
X x n v 1 V7 X bAA 


i onn 


pp trrr 1 tp t a 


nnnnm^nt^n 
OACCfeCAl^AC 


aprp npiv ppt 


r^'Vn R a a TPP 
C 1 bALAA X CL- 


aT a pppt"ppp 
niAtlll lit 


t a tpp a a a aT 
Xnl LtnAAn 1 


i p An 
lOOU 


CCTTAACCCA 


GCACjCjI lilt 


T A a JV Ti n n n n A 

TAAAAGGGGA 


TCTAAATCTT 


AA1 1AAX I AC 


paTapaaapr* 
LA 1 AC AAAbu 




TCAAACCAGA TCTAGGAGGA ACTTCCTTCA GGACAGGATG ATAGATGGTT CCTCCCAGGC 1980
GATTAAAGAA AATAAAAAGA CACATGGGCA GCCAGTAAGT GATAAGGGAA CACTAGTAGA 2040
AGCAGTTAGG AGAAGTTGCC TAATAATTGG TCTACTCCAA ATGTGTGAGT TGTTCGCACT 2100
CAGCCCAAAT CTTAAAGTAC TTACAGAATT AGGGAGGAGC CATTTACACC AATTCTAAGT 2160
TAATATGGAC TGGATGAGGT TTTATTAATA GCGAAGGAGA ATTAAATCCT AAACTNACAA 2220
GGTTTTCAAC TAAAGTAAAT TTTACTAAAA GCTAACAGTG TAACATGCAT TATCCTACTA 2280
CAACACACTC TCANAGGATT CCTCAGACAG TTTACAAGAA ATAACAAAAT CTATCTGGTA 2340
AGGATAGTAA CTACAATCCC AAATACATTC TTTGGCAGCA GTGACTCTC 2389



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2448 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

TCAGGGATAG CCCCCATCTA TTTGATCAGG CACTAGCCCA AGATCTAGGC CACTTCTGAA 60
GTCCAGGCAT TCTAGTCCTT CAGTATGTGG ATGATTTACT TTTGGCTACC AGTTTGGAAG 120
CCTCATGCCA GCAGGCTACT TGAGATCTCT TGAACTTTCT AGCTAATCAA GGGTGTATGG 180
CATCTAAATT GAAAGTCCAG CTCTGCCTAC AACAAGTCAA ATATCTAGGC CTAATCTTAG 240
ATAGAAGAAC CAGGGCCCTC AGCAAGGAAT GAATAAAGCC TATGCTGGCT TATCGGCACC 300
CTAAGACATT AAAACAATTG TGGGGGTTCC TTGGAATCAC TGGCTTTTGC CGACTATGGA 360
TCCCTGGATA GAGTGAGATA GCCAGGCCCC CTCTATTACT CTTATCAAGG AGACCCAGAG 420
GGCAAATACT TATCTAGTAT TATGGGNACC AGAGGCAGAA AAAGCCTTCC AAACCTTAAA 480
GGAGACCCTA GTACAAGCTC CAGCTTTAAG CCTTCCCACA GGACAAANCT TCTCTTTATA 540
TGTCACAGAG AGAGCAGGAA TAGCTCCTGG AGTCCTTACT CAGACTTTTG GACGACCCCA 600


on 

c> XJ 






1 f\f\\*\j f\nJ\ 1 1 


Vj A lul AO 1 oij 


C ft & b A CI CI PTC 


c rrTPurTr t 1 
V7l~l» 1 UnU 1 vj 1 


oou 




TTATGGGTAG TTGCGGCTGT GGCAGTCTTA CTGTCAAAGG CTATCAAAAT AATACAAGGA 720
AAGGATTTCA CTATCTGGAC TACTCATGAG GAAAATGGCA TATTAGGTGC CAAAGGAAGT 780
TTTTGGCTAT CAGACAACCA CCTGCTCAGA TTCCAGGCAC TACTGATTGA GAGACCAGTG 840
CTTTAAATAT GTATGTGTGT GTGTGGCCCT CAACCCTGCC ACTGTTCTCC CAGAAGATGG 900
AGAACCAATG AAGCATTACT GTCAACAAAT TAGAGTCCAG AGTTATGCTG CCTGAGAGGA 960
TCTCTTAGAA GTCCCCTTAG CTAATCCTGA CCTTAACCTA TATGCTGATG GAAGTTCACT 1020
TGTGGAGAAT GGGATACGAA AAGCACATTA TGCCATAGTT AGTGAGGTAA CAGTACTTGA 1080
AAGTAAGCCT ATTCCCCCAT GGACCAGAGC CCAGTTAGCA GAACTAGTGG CACTTACCCA 1140
AGCCTTAGAA CTAGGAAAGG GAAAAATAAT AAATGTGTAT ACAGATAGCA AGTATGCTTA 1200
TCTAATCCTA CATGCCCATG CTGCAGTATG GAAAGAAAGG GAGTTCCTAA CCTCTGGGGG 1260
AACCCCCATT AAATACCACA AGGCAAATCA TGGAGTTATT GCATGTAGTG CAAAACCTCA 1320
AGTAGGTGGC AGTTTTACAC TGCCTGAAGC TATGGGGAAG GAGAGAGGAG AACAGCAGCA 1380
TAAGTGGCTA GCAGAGGCAG CGAAAGACTA GCAGAGAGGA GAGGTAGGGG AAAGACAGAA 1440
AGTCAAAGAA AAGAAGTCAA AGACAGACAG AGAAAGAGAC AGAGGGAGCC AGAGAGAAAG 1500
AAAAGAGAGA ACGAAAGAGA CAGAATGTCA AAGAACAGAA GAGAGAGGCA GCGCCAGAAG 1560
AGTTAAGAAA GTGAGAAAGA GAGATGGAAA TAGTAAAGAA AAAACAGTGT ACCCTATTCC 1620
TTTAAAAGCC AGGGTAAATT TAAAACGTAT AATTTTATAA TTGGAAGGTC TTCTCCATAA 1680
CCCTATAACA TTAAAATACC ACCTTGTTGT CAGTGTAAAC AAGAGCATAG CCCAAAAGCA 1740
CTGAGGCCAC TGACAACCCA TAGCCTTCCT ATCAAAAATC CTTAACTCTG CAGGTTTCCT 1800
AACAGGGGAT CTAAATCTCA ACTAATCACC ATACAATGGT CCGACCAGAC CTAGGAGCGA 1860
CTCCCCTCAG GACAGAAGGA TGGATGGTTC CTCCCAGGCC ATTAAGGGAA AGAGACACAA 1920
TGGGTATTCA GTAAGTGATA AGGGAACTCT TGTAGAAGCA GTTAGGAAGA TTGCCTAATA 1980
TTTGGTCTGC TCAAATGTGC CAGCTGTTTG CACTCAGCTA AACCTTAAAT TACTTACAGA 2040
ATTAGGAAGG AGCCATCTAT ACCAATTCTG AGTTAATATG AGCTGAACAA GTTCTTATTA 2100
ATAGCAAAGA ATCATTGAAA TCTCAAACTT GCAAAGTTTT CAACAAAAGT AAAGTTTGCT 2160
GAAAGTTAGC AGTGTAACAT GTATTATCCT AACTTCTAAT CTTGTGGAAA TCAGACCCTA 2220
TCAGTGCCCC TCAAAGCTGA AGTCCATCAG CATATGGCCA TACAACTAAT ACCCCTATTT 2280
ATAGGGTTAG GAATGGCCAC TGCTACAGGA ATGGGAGTAA CAGGTTTATC TACTTCATTA 2340
TCCTATTACC ACACACTCTT AAAGGATTTC TCAGACAGTT TACAAGAAAT AACAAAATCT 2400
ATCCTTACTC TNTARTCCCA AATAGRTTCT TTGGCAGCAG TGACTCTC 2448

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCTGAGTTCT TGCACTAACC C 21

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

GTCCGTTGGG TTTCCTTACT CCT 23
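
Several short entries in this listing appear to be the opposite strand of other entries. As an illustration only, the sketch below compares SEQ ID NO: 55 with the reverse complement of SEQ ID NO: 16; the listing itself does not state how the two are related, and the helper name `reverse_complement` is an assumption of this example.

```python
# Complement map for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


SEQ_ID_16 = "AGGAGTAAGG AAACCCAACG GAC"
SEQ_ID_55 = "GTCCGTTGGG TTTCCTTACT CCT"

is_opposite_strand = reverse_complement(SEQ_ID_16) == SEQ_ID_55.replace(" ", "")
print(is_opposite_strand)
```

The same comparison can be run on any pair of entries of equal length; ambiguity codes such as N or R would need an extended complement map, which this sketch leaves out.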



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1196 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA 60
GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 120
ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC 180
AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 240
AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAA AGGAAGAAAA 300
TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA 360
CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 420
CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 480
CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 540
TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 600
CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 660
CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 720
ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA 780
CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 840
ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 900
TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 960
TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 1020
TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 1080
CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 1140
ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 1196



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2391 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC CATCACCCTC 60
ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG GTNACTGTCT CCTGGACACT 120
GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT CCTGGACAAC TGTCCTCCAG ATCTGTCACT 180
GTCCGAGGGG TCCTAGGACA GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC 240
TGGGGAACTT TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT AGGAGAAGGA 360
ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC CTGAAGTCCG GGCAACAGAA 420
GGACAATATG GACAAGCAAA GAATGCCCGT CCTGTTCAAG TTAAACTAAA GGATTCCACC 480
TCCTTTCCCT ACCAAAGGCA GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG 540
ATTGTAAAGG ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA ACTCAGGATT 660
ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA ACCCTTATAC AGTGCTTTCC 720
CAAATACCAG AGGAAGCAGA GTGGTTTACA GTCCTGGACC TTAAGGATGC CTTTTTCTGC 780
ATCCCTGTAC GTCCTGACTC TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG 840
TCTCAACTCA CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900


CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT CCTTCAGTAC 960
ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGAA 1020
CTCTTAACTT TCCTCACTAC CTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCGGCTCTGC 1080
TCACAGGAGA TTAGATACTN AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG 1140
GAACGTATCC AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC AATAGCCAGA 1260
CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA CCTATTTAGT AAGATGGACA 1320
CCTACAGAAG TGGCTTTCCA GGCCCTAAAG AAGGCCCTAA CCCAAGCCCC AGTGTTPACP 1380
TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA 1440
GTCCTTACGC AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATRfiRTAA TGGNGGCAGT ACCAGTPTNA 1560
GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN CTGTGTGGAC ATCTCATGAT 1620
GTGAACGGCA TACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA TTTACTTAAN 1680
TATCAGGCTC TATTACTTGA AGAGCCAGTG CTGNGACTGC GCACTTGTGC AAPTPTTAM 1740
CCCAAACTTA TGCTGCCCAG AAGGATCTTT NTARARRTPP PPTTARPPAA PPPTGAPPTP 1800
AACTATATAT ATACTGATGG AAGTTCGTTT GTACAAAAC£ GATTACAAAn GGNAGGATAT 1860
NCCATAGGTG TTAGTGATAA AGCAGTACTT fJAAAHTAAfZP W X W X X WWW WW cn a ci czcz a r* r* a 1920
GCGCCCCCGT TAGCAGAACT AGTGGCACTG ACCCCGCGAG CCTTAGAACT TTGGAAAGGG 1980
AGGAGGATAA ATGTGTATAC AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT 2040
GTTTATCTAA TCCGAAATGC CCATGTTGCA ATATGGAAAG AAAGGGAGTT CCTAACCTCT 2100
GGGGGAACCC CCATTAAATA CCACAAGTTA ATCATGGAGT TATTGCACAC AGTGCAAAAA 2160
CTCAAGGAGG TGGAAGTCTT ACACTGCCAA AGCCATCAGA AAAGGGAAAG GGGAGAAGAG 2220
CAGCATAAGT GGCTACAGAG GCAAGGAAAG ACTAGCAGAA AGGAAAGAGA GAAAGAGACA 2280
GAAAGTCAGA GAGAGAGAGA GGAAGAGACA GAGCACAAAG AGGGAGTCAG AGAGAGAGAG 2340
AGACAGAGAG TCAGAGAGAA GGAAAGAGAG AGAGGAAGAG ACAAAGAATG A 2391



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1722 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:





TGGAGAATAG CAGCATAAGT TGGCTGGCAG AAGTAGGGAA AGACAGCAAG AAGTAAAGAA 60

AAAAARGAGA AAGTCAGAGA AAGAAAAAAA GAGAGGAAGA AACAAAGAAG AAPTTPAAPA 120

GAGAAAGAAG TAGTAAAGAA AAAACAGTAT ACCCTATTCC TTTAAAAGCC AGGGTAAATT 180

TCTGTCTACC TAGCCAAGGC ATATTCTTCT TATGTGGAAC ATCAACCTAT ATPTGPPTPP 240

CCACTAACTG GACAGGCACC TGAACCTTAG TCTTTCTAAG TCCCAACATT AACATTGCCC 300

CAGGAAATCA GACCCTATTG GTACCTGTCA AAGCTAAAGT CCCGTCAGTG CAGAGCCATA 360

CAACTAATAT CCCTATTTAT AGGGTTAGGA ATGGCTACTG CTACAGGAAC TGGAATAGCC 420

GGTTTATCTA PTTPATTATP CTACTACCAT ACACTCTCAA AGAATTTPTP AGACAGTTTG 480




PA AG A A AT A A 


TGAAATPTAT 


TPTT APTTT A 
x \* x x x x x n 


PAATPPPAAT 


TAGAPTPTTT 
x nuAo x v^ x x x 


GGPAGPA ATG 

VJVJ V^tiVJ Vf>(ui X VJ 


3 *« w 




AfTPTrf AAA 


APPGPPGAGG 


PPPAPAPPTP 


PTPAPTGPTG 
v< x \*nv* x uv* x vj 


AG A A AGG AGG 

nVjnnnVJVJrtVJ Vj 


A PTPTGP A PP 
nv- X v— X v»v.nv«v» 


Ann 




x x v.* x x nvjvj uu 


AAG AG TGTTG 


TTTTT AP APT 

X X X X X rivnV- X 


AACCAGTCAG 
nnv^^nu x v-»nvj 


GGATAGTACG 

uun x nu x nou 


AGATGPPAPP 
nun x vj ^»nu v-» 


ODU 




TGGCATTTAC 


AGG AAAGGG C 


TTPTG AT ATP 
x x w x un ini v 


AGACAATGCC 

nvjnv*nn X vjv^v«» 


TTTPAAACTC 

XXX \»J^nnV* X w 


TTATACPAAP 
x xnxnv^v^nnV* 


720 

/ x, U 


15 


PTPTGGAGTT 

vi vi vj vjnVj X X 


GGGPAAPATG 


GPTTPTTPPA 

VJV^X X wX X 


TTTPTAGGTP 

X X X \+ X nvjw X V* 


PPATGGPAGP 


P A TPTTP PTP 
\>«nX x vj v-» x vj 






X in^/ X UnL^ X 


TTGGGPPPTG 


T ATTTTT A AG 
1 r\ X X X X X fin Vj 


PTTPTTPTP A 

Vrf X X Va> X X VJ X 


A A TTTGTTTP 


PTPTAGGATP 
x \- X nVjvjn X v» 






PAAPPPATPA 


APPTAPAPAT 
nut x ntnvn 1 


GGTPTTAPA A 
vj*j rVV»»/Vrt 


ATPPA APPPP 
nX vjvjnrYv«.v^v^v«. 


A A ATGAGTTP 
rV/Vn i Vj/AVj XXV- 


A APT A APA AP 






TTPTAPPA AP 


PAPPPPTPPA 


APP ATPPAPT 


PPPAPTTPPA 


PTAGPPTAG A 


P A TTPf PPTP 
vjnX X L^Lv X l« 






TfinAAnapAf 

X uunnunUnL 


TAP A APTPPA 


PPPPPPPTTP 


rrtrpfji/^ f** {** R 

X JL lututtln 


TPPAPPAPPA 
1 v^v-nvjv-nvjvjn 


APTAPPTAP A 
nvj X nvj v-r i non 


1 POP 


20 


PPPPTPATPP 
vj Www X \~t\ X V-vj 


PPPAA ATTPP 


PA AP AGP APT 


TGGPPTPTPP 

X VJVjljVJ X VJ X V-rV-» 


TGTTTAGAGG 
x vj X x x nvjnvj vj 


PPPPATTPA A 
vjvjvjvjn X 1 «nn 






GAGGTGAPAG 


PPTGPTGGPA 


GCPTCACAGC 

VJ V»» V/ X ^rtVyAVJV* 


CCTCGTTGGY 

»-»V-» X V»U X X VjVJ X 


TPTP AGTGPP 
x v» x x v> v-r v- 


TCCTCAGCPT 

x v-» w x L.nuv.u x 


1140 




TGGTGCCCAC TCTGGCCGTG CTTGAGGAGC CCTTCAGCCT GCCACTGCAC TGTGGGAGCC 1200

TCTTTCTGGG CTGGACAAGG CCGGAGCCAG CTCCCTCAGC TTGCAGGGAG GTATGGAGGG 1260

AGAGATGCAG GCGGGAACCA GGGCTGCGCA TGGCGCTTGC GGGCCAGCAT GAGTTCCAGG 1320

TGGGCGTGGG CTCGGCGGGC CCCACACTCG GGCAGTGAGG GGCTTAGCAC CTGGGCCAGA 1380

CAGATGCTGT GCTCAACTTC TTCGCTGGGC CTTAGCTGCC TTCCCCGTGG GGCAGGGCTY 1440

CGGGAACMTG CAGCCTGCCC ATGCTTGAGC CCCCCACCCC GCCGTGGGTT CYTGCACAGC 1500

CCAAGCTTCC CGGACAAGCA CCACCCCTTA TCCACGGTGC CCAGTCCCAT CAACCACCCA 1560

AGGGTTGAGG AGTGCGGGCA CACAGCGCGG GATTGGCAGG CAGTTCCACT TGCGGCCTTG 1620

GTGCGGGATC CACTGCGTGA AGCCAGCTGG GCTCCTGAGT CTGGTGGGGA CTTGGAGAAT 1680

CTTTATGTCT AGCTAAGGGA TTGTAAATAC ACCAATCAGC AC 1722



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 495 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60

GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120

GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180

ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240

GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300

CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360

TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420

CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480

TGCCGCAGAC ATTTA 495

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2503 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

CCAAGAACCC ACCAATTCCG GANCACATTT TGGCGACCAC GAAGGGACTT TCGCATATCG 60

CCAAGCGGTG AGACAATAGC CGAGCGGTGA GACCTTTCCC AATCGCCAAG CAGTGAGTAC 120

CATCAGACCC CTTTCACTTG CTATTCTGTC CTATCTTTCT TTAGAATTCG GGGGCTAAAT 180

ACCGGGCATC TGTCAGCCAT TTAAAAGTGA CTAGCGGGCC GCCGGACTAA AGACACGGGT 240
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GTCAAGCTTT CTGGGAAAGG GCTCTCTAAC 
GGTTTGCCTA GAACCAGCTT CCGCTTTTCC 
GTGAAGGAAA GCCATGCATC TCCGGGGTCT 
GAGTGGAACT CTCAAAAGCA TGTCGCCCAA 
TGACCCTTGC CCTCTGGGTC CTAATGCCTG 
TGAAGCTAGA ACCGCTTCTA AAAATTGCTA 
CTATAAAGAA TGAWTTCTAG TATTAAACTC 
GGCTCACCAA TCAGAAAGAC ACAGTTTTTG 
GGAATTTTAG GATCCCTCCT CAGACTAACA 
ATATGGGGAG CCTCAGAAAT TGTATCCCTC 
ACTCTTCCAA CCCTGAAGAT CCCCTCCCTC 
GTGGCATAAC ATCTTTATAG GATGGGGTAA 
ACTCTAACAG GTTTTTGAGA ATGCGTCAGT 
GGTCCTCCTT GTGGTCTAGG AGGACAGGCA 
TAAGGACCAC TAAATCCGAC CTTCCTCGGT 
TTTCTGCTGC TGCGTCGGTG AGCGCAACTA 
AGGTTCTTGG GCAGGGGTTG TTTCTGCTGC 
CAGGGTCCCA GGACCATTGC AGGTCCTTGG 
GTGGGCGGTT TTGTCTTTCA TATGGGAAAC 
TGCATCCTAA GCCATTGGGA CCAATTTGAC 
TTTTCCTGCA CTACGGCTTG GCCCCAATAT 
GAGGGAAGCA CAAATTACAA TAYTATCCTA 
AAATGGAGTG AATACCTTAT GTCCAAGCTT 
GCAAAGCTTG CAATTTACAT CCCACAGGAG 
TCCCTATAGC TTCCCTTCCT ATTGATGATA 
AAATAAGCAA AGAAATCTCC AAAGGTCCAC 
TCAAGYTGTA GGGGGAGGGG AATTTGGCCC 
GATTTAAAGC AGATCAAGGC AGACCTGGGG 
GATGTCCTAC AGGGTCTAGG GCAAACCTTT GACCTCACTT GGAGAGACGT CATGCTACTG 1980

TTAGATCAAA CCCTGGCCTT TAATGAAAAG AATGCGGCTT TAGCTGCAGC CTGAGAGTTT 2040

GGAGATACCT GGTATCCTAG TCAAGTAAAT GAAAGAATGA CAGCCGAAGA AAGGGACAAC 2100

TTCCTTACTG GTCAGCAACC CATCCCCAGT ATGGATCCCC ACTGGGACTT TGACTCAGAT 2160

CATGGGGACT GGAGTCGTAA ACATCTGTTG ATCTGTGTTC TGGAAGGACT AAGGAGAATT 2220

GGGAAAAAGC CCATGAATTA TTCAATGATA TCCACCATAA CCCAGGGAAA GGAAGAAAAT 2280

CCTTCTGCCT TCCTCGAGCG GCTACAAGAG GCCTTAAGAA AATATACTCC CCTGTCACCC 2340

GAATCACTCG AGGGTCAATT GATTCTAAAA GATAAGTTTA TTACCCAATC AGCCACAGAT 2400

ATCAGGAGAA AGCTCCAAAA GCAAGCCCTG AGCCTGAACA AAATCTAGAG ACATTATTAA 2460

ACCTGGCAAC CTTGGTGTTC TATAATAGGG ACCAAGAGGA ACA 2503

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

AAGGAAACTC AGAAAGCCAA TACCCATTTA GTAAGATGGA CACCAGAAGC AGAAGCAGCT 60

TTCCAGGCCC TAAAGAAATC CCTAACCCAA GCCCCAGTGT TAAGCTTGCC AACGGGGCAA 120

GACTTTTCTT TATATGTCAC AGAAAAACAG GAATAGCTCT AGGAGTCCTT ACACAGGTCC 180

AAGGGACAAG CTTGCAACCT GTGGCATACC TGAGTAAGGA AACTGATGTA NTGGCAAAGG 240

GTTGGCCTCA TTGTTTACAG GTAGGGCAGC AGTAGCAGTC TTAGTTTCTG AAACAGTTAA 300

AATAATACAG GGAAGAGATC TTACTGTGTG GACATCTCAT GATGTGAACG GCATACTCAC 360

TGCTAAAGAG GACTTGTGGC TGTCAGACAA CCATTTACTT AAATAGCAGG TTCTATTACT 420

TGAAGTGCCA GTGCTGCGAC TGCACATTTG TGCAACTCTT AACCCAGCCA CATTTCTTCC 480

AGACAATGAA GAAAAGATAG AACATAACTG TCAACAAGTA ATTGCTCAAA CCTATGCTGC 540

TCGAGGGGAC CTTCTAGAGG TTCCCTTGAC TGATCCCGAC CTCAACTTGT ATACTGATGG 600

AAGTTCCTTG GCAGAAAAAG GACTTTGAAA AGCGGGGTAT GCAGTGATCA GTGATAATGG 660

AATACTTGAA AGTAATCGCC TCACTCCAGG AACTAGTGCT CACCTGGCAG AACTAATAGC 720

CCTCACTTGG GCACTAGAAT TAGGAGAAGG AAAAAGGGTA AATATATATT CAGACTCTAA 780

GTATGCTTAC CTAGTCCTCC ATGCCCATGC AGCAATATGG AGAGAGAGGG AATTCCTAAC 840

TTCTGAGGGA ACACCTATCA ACCATCAGGG AAGCCATTAG GAGATTATTA TTGGCTGTAC 900

AGAAACCTAA AGAGGTGGCA GTCTTACACT GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG 960

AAATAGAAGG CAATCGCCAA GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC 1020

CATTAGAAAT GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC 1080

CCCAGTACTC AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT 1140

CCAGATGGCT AGCCACTGAG GAAGGAA 1167
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(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT 60
CCCAAAACCC TAAAGCAA 78

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Ser Lys Gly Thr Arg Ala Leu Ser Glu Glu Arg Ile Gln Pro Ile Leu
1 5 10 15

Ala Tyr Pro His Pro Lys Thr Leu Lys Gln
20 25
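
The 78 bases of SEQ ID NO: 62 correspond to exactly 26 codons, and read in the first frame they appear to encode the 26 residues of SEQ ID NO: 63. The following minimal Python sketch checks that correspondence under the assumption that the standard genetic code applies; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative and are not part of the listing.

```python
# Illustrative check only: translate SEQ ID NO: 62 with the standard genetic code
# (assumed here) and compare the result with the residues listed for SEQ ID NO: 63.

BASES = "TCAG"
# One-letter amino acids for all 64 codons, in TCAG order (standard code).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter", "X": "Xaa",
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


SEQ_ID_62 = (
    "TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT "
    "CCCAAAACCC TAAAGCAA"
)

peptide = translate(SEQ_ID_62)
three_letter = " ".join(THREE_LETTER[aa] for aa in peptide)
print(three_letter)
```

The printed three-letter string can be compared line by line with the residues given for SEQ ID NO: 63.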



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

AAATGTCTGC GGCACCAATC TCCATGTT 28

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AAGGGGCATG GACGAGGTGG TGGCTTATTT 30

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

GGAGAAGAGC AGCATAAGTG G 21

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

GTGCTGATTG GTGTATTTAC AATCC 25

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 34

(2) INFORMATION FOR SEQ ID NO: 69:



5/9/2006, EAST Version: 2.0.3.0 



WO 98/23755 



PCT/IB97/01482 



161 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GCCATCAAGC CACCCAAGAA CTCTTAACTT 30

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

CCAATAGCCA GACCATTATA TACACTAATT 30

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

GCCATAACTG CAACCCAAGA GTT 23

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GGACGAGGTG GTGGCTTATT TCT 23

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

AACTTGCGTG CTAGAAGGAC TAAGG 25


(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

AACTTTTCCC TTTTCCAGAT CCTC 24

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GCATACCAGG CAAGTGGACA TT 22

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAGGCTCTGG AAAAGGGAAA AGTT 24

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

AGGAGTAAGG AAACCCAACG GACAG 25
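
Several of the short oligonucleotides in this listing appear to be related by strand orientation; for example, SEQ ID NO: 79 reads as the reverse complement of SEQ ID NO: 78. The sketch below, given only as an illustration, shows how such a relationship can be checked; the function name `reverse_complement` and the variable names are assumptions made for the example and do not come from the listing.

```python
# Illustrative check only: compare SEQ ID NO: 79 with the reverse complement
# of SEQ ID NO: 78. Only unambiguous bases (A, C, G, T) are handled here.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


SEQ_ID_78 = "CTGTCCGTTG GGTTTCCTTA CTCCT"
SEQ_ID_79 = "AGGAGTAAGG AAACCCAACG GACAG"

is_reverse_complement = reverse_complement(SEQ_ID_78) == "".join(SEQ_ID_79.split())
print(is_reverse_complement)
```

The same helper can be applied to any other pair of primers in the listing to see whether they anneal to opposite strands of the same region.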

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

TGTATATAAT GGTCTGGCTA TTGGG 25

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

AGGAGTAAGG AAACCCAACG GACAG 25

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

CTCGATTTCT TGCTGGGCCT TA 22

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GTTGATTCCC TCCTCAAGCA 20

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CTCTACCAAT CAGCATGTGG 20

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

TGTTCCTCTT GGTCCCTAT 19

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 433 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Ala Thr Ala Thr Gly Thr Gly Ile Ala Gly Leu Ser Thr Ser Leu
1 5 10 15

Ser Tyr Tyr His Thr Leu Ser Lys Asn Phe Ser Asp Ser Leu Gln Glu
20 25 30

Ile Met Lys Ser Ile Leu Thr Leu Gln Ser Gln Leu Asp Ser Leu Ala
35 40 45

Ala Met Thr Leu Gln Asn Arg Arg Gly Pro His Leu Leu Thr Ala Glu
50 55 60

Lys Gly Gly Leu Cys Thr Phe Leu Gly Glu Glu Cys Cys Phe Tyr Thr
65 70 75 80

Asn Gln Ser Gly Ile Val Arg Asp Ala Thr Trp His Leu Gln Glu Arg
85 90 95

Ala Ser Asp Ile Arg Gln Cys Leu Ser Asn Ser Tyr Thr Asn Leu Trp
100 105 110

Ser Trp Ala Thr Trp Leu Leu Pro Phe Leu Gly Pro Met Ala Ala Ile
115 120 125

Leu Leu Leu Leu Thr Phe Gly Pro Cys Ile Phe Lys Leu Leu Val Lys
130 135 140

Phe Val Ser Ser Arg Ile Glu Ala Ile Lys Leu Gln Met Val Leu Gln
145 150 155 160

Met Glu Pro Gln Met Ser Ser Thr Asn Asn Phe Tyr Gln Gly Pro Leu
165 170 175

Glu Arg Ser Thr Gly Thr Ser Thr Ser Leu Glu Ile Pro Leu Trp Lys
180 185 190

Thr Leu Gln Leu Gln Gly Pro Phe Phe Ala Pro Ile Gln Gln Glu Val
195 200 205

Ala Arg Ala Val Ile Gly Gln Ile Pro Asn Ser Ser Trp Gly Val Leu
210 215 220

Phe Arg Gly Gly Ile Glu Glu Val Thr Ala Cys Trp Gln Pro His Ser
225 230 235 240

Pro Arg Trp Xaa Ser Val Pro Pro Gln Pro Trp Cys Pro Leu Trp Pro
245 250 255

Cys Leu Arg Ser Pro Ser Ala Cys His Cys Thr Val Gly Ala Ser Phe
260 265 270

Trp Ala Gly Gln Gly Arg Ser Gln Leu Pro Gln Leu Ala Gly Arg Tyr
275 280 285

Gly Gly Arg Asp Ala Gly Gly Asn Gln Gly Cys Ala Trp Arg Leu Arg
290 295 300

Ala Ser Met Ser Ser Arg Trp Ala Trp Ala Arg Arg Ala Pro His Ser
305 310 315 320

Gly Ser Glu Gly Leu Ser Thr Trp Ala Arg Gln Met Leu Cys Ser Thr
325 330 335

Ser Ser Leu Gly Leu Ser Cys Leu Pro Arg Gly Ala Gly Leu Arg Glu
340 345 350

Xaa Ala Ala Cys Pro Cys Leu Ser Pro Pro Pro Arg Arg Gly Phe Leu
355 360 365

His Ser Pro Ser Phe Pro Asp Lys His His Pro Leu Ser Thr Val Pro
370 375 380

Ser Pro Ile Asn His Pro Arg Val Glu Glu Cys Gly His Thr Ala Arg
385 390 395 400

Asp Trp Gln Ala Val Pro Leu Ala Ala Leu Val Arg Asp Pro Leu Arg
405 410 415

Glu Ala Ser Trp Ala Pro Glu Ser Gly Gly Asp Leu Glu Asn Leu Tyr
420 425 430

Val
433



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 693 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60

GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120

GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180

ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240

GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300

CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360

TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420

CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480

TGCCGCAGAC ATTTACTAAC TTGCGTGCTA GAAGGACTAA GGAAAACTAG GAAGACTATG 540

AATTATTCAA TGATGTCCAC TATAACACAG GGGAAAGGAA GAAAATCCTA CTGCCTTTCT 600

GGAGAGACTA AGGGAGGCAT TGAGGAAGCA TACCAGGCAA GTGGACATTG GAGGCTCTGG 660

AAAAGGGAAA AGTTGGGCAA ATTGAATGCC TAA 693

(2) INFORMATION FOR SEQ ID NO: 89:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1577 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:





MAL 1 1 bUb 1 O 


C 1 A^AAWuAL 


Taarr'aaaar* 
1 AAbOAAAAt 


1 AbbAA(jAU I 


ATGAATTATT 


f»ft ft ft rp/-» m/-T 

CAA iGATGTC 


oO 




paPTaTaapa 
UnL 1 A 1 AALA 


p a orr'P a a a rs 

CACCCCAAAC 


paapaaaaTP 

bAAbnAAA 1 C 


p^pa p* i'^^ ^^^"T 1 r p 


tptpp apap a 

1U1 bbAbAVjA 


r ,r Pftft^ , r , oft/^/^ 
O 1 AAbljbAbb 


i oa 




paTTpappa a 

CA1 KjAVjOAA 


OCA 1 ACC ACC 


pa aPTPPapa 

C AAC 1 CO ALA 


ttpp arrpTP 

1 1 OOAOCC 1 C 


ttt' ft ft tv a cr*r 
1 OOAAAAOOO 


ft ft ft ft PTTPPP 
AAAAb 1 1 COO 


i oa 




tAAAi lbAAl 


uLL X AA 1 Auu 


PPTTfPTTPP 
1 1 1 1 LL 


UPTPPftPTPT 
ALi 1 LiC AO 1 CI 


ftpft appb r , r x r > 
ALAAViUALVjL 


rnipw ft r» ft ft ft ft r« 
I 1 iAGAAAAG 


Oyf A 




ivmminfnppji 

nllul LtAAb 


1 AC AAA 1 Anu 


CCCCCCC 1 Lu 


iCCAiCCCCC 


1 1 A 1 O 1 tAAu 


ft RTPRPTC 

ubAA 1 UAt 1 O 


JUL) 




i** a & p p p p t a p 


ICCCCCAOOC 


papp aappTP 

uAUuAAub 1 C 


ptptp apTpa 


p a a p p p a pt a 


a pptp a tp a t 

MUv 1 O A 1 uA 1 






pp appa^'far* 
tLAU C AC L-AIj 


PSPT^JiPPPT 
«AL 1 bAbbU 1 


CCCCCOCOCA 


apTpppappp 

AO 1 OCCACCC 


Pft TPPPl TP ft 
LAlOt^AltA 


PPPTP ft C ft P /■* 

Cue 1 LAGAGC 








111 OAOLA 1 1 


papapppapp 


aaPTTaapTp 

rtnu 1 1 nnL 1 Vj 


TPTPPTPPaP 
1U1 Lt 1 Aw 


apTpppppap 


Ann 




CCTTCTCAGT CTTACTTTCC TGTCCCAGAC AATTGTCCTC CAGATCTGTC ACTATCCGAG 540

GGGTCCTAAG ACAGCCAGTC ACTACATACT TCTCTCAGCC ACTAAGTTGT GACTGGGGAA 600

CTTTACTCTT TTCACATGCT TTTCTAATTA TGCCTGAAAG CCCCACTCCC TTGTTAGGGA 660

GAGACATTTT AGCAAAAGCA GGGGCCATTA TACACCTGAA CATAGGAAAA GGAATACCCA 720

TTTGCTGTCC CCTGCTTGAG GAAGGAATTA ATCCTGAAGT CTGGGCAATA GAAGGACAAT 780

ATGGACAAGC AAAGAATGCC CGTCCTGTTC AAGTTAAACT AAAGGATTCT GCCTCCTTTC 840

CCTACCAAAG GAAGTACCCT CTTAGACCCG AGGCCCTACA AGGACTCAAA AGATTGTTAA 900

GGACCTAAAA GCCCAAGGCC TAGTAAAACC ATGCAGTAGC CCCTGCAATA CTCCAATTTT 960

AGGAGTAAGG AAACCCAACG GACAGTGGAG GTTAGTGCAA GATCTCAGGA TTATTAATGA 1020

GGCTGTTTTT CCTCTATACC CAGCTGTATC TAGCCCTTAT ACTCTGCTTT CCCTAATACC 1080

AGAGGAAGCA GAGTAGTTTA CAGTCCTGGA CCTTAAGGAT GCCTCTTTCT GCATCCCTGT 1140

ACATCCTGAT TCTCAATTCT TGTTTGTCTT TGAAGATCCT TTGAACCCAA TGTCTCAATT 1200

CACCTGGACT GTTTTACCCC AGGGGTTCCG GGATAGCCCC CATCTATTTG GCCAGGCATT 1260

AGCCCAAGAC TTGAGCCAAT TCTCATACCT GGACATCTTG TCCTTCGGTA TGGGATGATT 1320

TAATTTTAGC CACCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGCG TTCTTAAATT 1380

TCCTCACTCC GTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCAGCTCTGC TCACAGCAGG 1440

TTAAATACTT AGGGTTAAAA TTATCCAAAG GCACCAGGGC CCTCTGTGAG GAATGTATCC 1500

AACCTGTACT GGCTTATCTT CATCCCAAAA CCCTAAAGCA ACTAAGAAGG TCCTTGGCAT 1560

AACAGGTTTC TGCCGAA 1577



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 182 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ser Ser Ser Arg Thr Glu Gly Ala Arg Gly Lys Cys Gln Pro Met Pro
1 5 10 15

Ser Pro Ser Glu Pro Arg Val Cys Leu Thr Ile Glu Ser Gln Glu Val
20 25 30

Asn Cys Leu Leu Asp Thr Gly Ala Ala Phe Ser Val Leu Leu Ser Cys
35 40 45

Pro Arg Gln Leu Ser Ser Arg Ser Val Thr Ile Arg Gly Val Leu Arg
50 55 60

Gln Pro Val Thr Thr Tyr Phe Ser Gln Pro Leu Ser Cys Asp Trp Gly
65 70 75 80

Thr Leu Leu Phe Ser His Ala Phe Leu Ile Met Pro Glu Ser Pro Thr
85 90 95

Pro Leu Leu Gly Arg Asp Ile Leu Ala Lys Ala Gly Ala Ile Ile His
100 105 110

Leu Asn Ile Gly Lys Gly Ile Pro Ile Cys Cys Pro Leu Leu Glu Glu
115 120 125

Gly Ile Asn Pro Glu Val Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala
130 135 140

Lys Asn Ala Arg Pro Val Gln Val Lys Leu Lys Asp Ser Ala Ser Phe
145 150 155 160

Pro Tyr Gln Arg Lys Tyr Pro Leu Arg Pro Glu Ala Leu Gln Gly Leu
165 170 175

Lys Arg Leu Leu Arg Thr
180



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC 36

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

AGATCTGCAG AATTCGATAT CA 22

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2304 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:



TCCAGCAGCA GGACTGAGGG TGCCCGGGGC AAGTGCCAGC CCATGCCATC 50

ACCCTCAGAG CCCCGGGTAT GTTTGACCAT TGAGAGCCAG GAAGTTAACT 100

GTCTCCTGGA CACTGGCGCA GCCTTCTCAG TCTTACTTTC CTGTCCCAGA 150

CAATTGTCCT CCAGATCTGT CACTATCCGA GGGGTCCTAG GACAGCCAGT 200

CACTACATAC TTCTCTCAGC CACTAAGTTG TGACTGGGGA ACTTTACTCT 250

TTTCACATGC TTTTCTAATT ATGCCTGAAA GCCCCACTCC CTTGTTAGGG 300

AGAGACATTT TAGCAAAAGC AGGGGCCATT ATACACCTGA ACATAGGAAA 350

AGGAATACCC ATTTGCTGTC CCCTGCTTGA GGAAGGAATT AATCCTGAAG 400

TCTGGGCAAT AGAAGGACAA TATGGACAAG CAAAGAATGC CCGTCCTGTT 450

CAAGTTAAAC TAAAGGATTC TGCCTCCTTT CCCTACCAAA GGAAGTACCC 500

TCTTAGACCC GAGGCCCTAC AAGGANCTCA AAAGATTGTT AAGGACCTAA 550

AAGCCCAAGG CCTAGTAAAA CCATGCAGTA GCCCCTGCAA TACTCCAATT 600

TTAGGAGTAA GGAAACCCAA CGGACAGTGG AGGTTAGTGC AAGATCTCAG 650

GATTATTAAT GAGGCTGTTT TTCCTCTATA CCCAGCTGTA TCTAGCCCTT 700

ATACTCTGCT TTCCCTAATA CCAGAGGAAG CAGAGTGGTT TACAGTCCTG 750

GACCTTAAGG ATGCCTTTTT CTGCATCCCT GTACGTCCTG ACTCTCAATT 800

CTTGTTTGCC TTTGAAGATC CTTTGAACCC AACGTCTCAA CTCACCTGGA 850

CTGTTTTACC CCAAGGGTTC AGGGATAGCC CCCATCTATT TGGCCAGGCA 900

TTAGCCCAAG ACTTGAGTCA ATTCTCATAC CTGGACACTC TTGTCCTTCA 950

GTACGTGGAT GATTTACTTT TAGTCGCCCG TTCAGAAACC TTGTGCCATC 1000


<* — ' 


A AG C F A PfP A 


AGA APTPTTA 


APTTTPCTCA 
x x x v^v* x 


CTACCTGTGG 


CTACAAGGTT 


1050 




TCCAAACCAA AGGCTCGGCT CTGCTCACAG GAGATTAGAT ACTTAGGGCT 1100

AAAATTATCC AAAGGCACCA GGGCCCTCAG TGAGGAACGT ATCCAGCCTA 1150

TACTGGCTTA TCCTCATCCC AAAACCCTAA AGCAACTAAG AGGGTTCCTT 1200

GGCATAACAG GTTTCTGCCG AAAACAGATT CCCAGGTACA CCCCAATAGC 1250

CAGACCATTA TATACACTAA TTAGGGAAAC TCAGAAAGCC AATACCTATT 1300

TAGTAAGATG GACACCTACA GAAGTGGCTT TCCAGGCCCT AAAGAAGGCC 1350

CTAACCCAAG CCCCAGTGTT CAGCTTGCCA ACAGGGCAAG ATTTTTCTTT 1400

ATATGCCACA GAAAAAACAG GAATAGCTCT AGGAGTCCTT ACGCAGGTCT 1450

CAGGGATGAG CTTGCAACCC GTGGTATACC TGAGTAAGGA AATTGATGTA 1500

GTGGCAAAGG GTTGGCCTCA TTGTTTATGG GTAATGGCGG CAGTAGCAGT 1550

CTTAGTATCT GAAGCAGTTA AAATAATACA GGGAAGAGAT CTTACTGTGT 1600

GGACATCTCA TGATGTGAAC GGCATACTCA CTGCTAAAGG AGACTTGTGG 1650

TTGTCAGACA ACCATTTACT TAATTATCAG GCTCTATTAC TTGAAGAGCC 1700

AGTGCTGAGA CTGCGCACTT GTGCAACTCT TAAACCCGCC ACATTTCTTC 1750

CAGACAATGA AGAAAAGATA GAACATAACT GTCAACAAGT AATTGCTCAA 1800

ACCTATGCTG CTCGAGGGGA CCTTCTAGAG GTTCCCTTGA CTGATCCCGA 1850

CCTCAACTTG TATACTGATG GAAGTTCCTT GGCAGAAAAA GGACTTCGAA 1900

AAGCGGGGTA TGCAGTGATC AGTGATAATG GAATACTTGA AAGTAATCGC 1950


t imm« orr»o/"*iv o 
C 1 C ACTCC ACi 


GAAL I AC* lOt 


mo TV 0/"* r PO f OS 


O JV TV O'P TV TV *P TV /*• 

GAACTAATAG 


CCCTCACTTG 


2000 


GGCACTAGAA TTAGGAGAAG GAAAAAGGGT AAATATATAT TCAGACTCTA 2050

AGTATGCTTA CCTAGTCCTC CATGCCCATG CAGCAATATG GAGAGAGAGG 2100

GAATTCCTAA CTTCTGAGGG AACACCTATC AACCATCAGG AAGCCATTAG 2150

GAGATTATTA TTGGCTGTAC AGAAACCTAA AGAGGTGGCA GTCTTACACT 2200

GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG AAATAGAAGG CAATCGCCAA 2250

GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC CATTAGAAAT 2300

GCTT 2304



(2) INFORMATION FOR SEQ ID NO: 94: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2364 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:



ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC 50

CATCACCCTC ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG 100

GTNACTGTCT CCTGGACACT GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT 150

CCTGGACAAC TGTCCTCCAG ATCTGTCACT GTCCGAGGGG TCCTAGGACA 200

GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC TGGGGAACTT 250

TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300

TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT 350

AGGAGAAGGA ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC 400

CTGAAGTCCG GGCAACAGAA GGACAATATG GACAAGCAAA GAATGCCCGT 450

CCTGTTCAAG TTAAACTAAA GGATTCCACC TCCTTTCCCT ACCAAAGGCA 500

GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG ATTGTAAAGG 550

ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
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L.Lnn x X X Ax»w* 


GAGTAAGGAA 


ACCCAACGGA 


CAGTGGAGGT 


TAGTGCAAGA 


650 




ACTPAGGATT 


AT C AATG AG G 


CTGTTGTTCC 


TCTATACCCA 


GCTGTACCTA 


700 




ACCCTTATAC 
r>w w w x a i» x ** * 


AGTGCTTTCC 


CAAATACCAG 


AGGAAGCAGA 


GTGGTTTACA 


750 




GTCCTGGACC 


TTAAGGATGC 


CTTTTTCTGC 


ATCCCTGTAC 


GTCCTGACTC 


800 


5 


TCAATTCTTG 


TTTGCCTTTG 


AAGATCCTTT 


GAACCCAACG 


TCTCAACTCA 


850 




CCTGGACTGT 

Wv X w/w;«»\^ A w; x 


TTTACCCCAA 


GGGTTCAGGG 


ATAGCCCCCA 


TCTATTTGGC 


900 




PAGGPATTAG 


CCCAAGACTT 


GAGTCAATTC 


TCATACCTGG 


ACACTCTTGT 


950 




rfTTCAGTAC 
w x x t/ttu x 


ATGGATGATT 


TACTTTTAGT 


CGCCCGTTCA 


GAAACCTTGT 


1000 




GCCATCAAGC 


CACCCAAGAA 


CTCTTAACTT 


TCCTCACTAC 


CTGTGGCTAC 


1050 


10 


AAGGTTTCCA 

*vnvw AAA. wwrc 


AACCAAAGGC 


TCGGCTCTGC 


TCACAGGAGA 


TTAGATACTN 


1100 




AGGGCTAAAA 


TTATCCAAAG 

X X ** X W ****** w 


GCACCAGGGC 


CCTCAGTGAG 


GAACGTATCC 


1150 




AGCCTATACT 

<TkVj v» w x r» x n w a 


GGCTTATCCT 

vj\Jw a x**x wwx 


CATCCCAAAA 


CCCTAAAGCA 


ACTAAGAGGG 


1200 




TTPPTTfifiPA 

X l«wl lUUvn 


TAACAGGTTT 

X ♦xT* wilWw XXX 


CTGCCGAAAA 


CAGATTCCCA 


GGTACASCCC 


1250 




AATAGCCAGA 


CCATTATATA 


CACTAATTAN 


GGAAACTCAG 


AAAGCCAATA 


1300 


15 


PCTATTTART 

V* w X ill X X *»w A 


AAGATGGACA 


CCTACAGAAG 


TGGCTTTCCA 


GGCCCTAAAG 


1350 




AAGRfrrTAA 


PPPAAGCCCC 


AGTGTTCAGC 

s«w X VJ X X w*»w w 


TTGCCAACAG 


GGCAAGATTT 


1400 




TTPTTT A T 1 A *P 


RPPAPAOAAA 


AAACAGGAAT 


AGCTCTAGGA 

M\\J W X w X f» w w «* 


GTCCTTACGC 

w x w w x x *» ww w 


1450 




nw 1 w x w/iw" 1* 


GATGAGPTTG 


PAAPCCGTGG 


TATACCTGAG 
ini nw w x vjfiu 


TAAGGAAATT 


1500 




GATGTAGTGG 
Vjf \ iul t\\J X Ow - 


PAAAGGGTTG 


GCCTCATNGT 

V? WW X wil 1 11VJ 1 


TTATGGGTAA 

x x n x www x 


TGGNGGCAGT 

X w xJ i 1 w \ J w Ow X 


1550 




AGPAGTPTNA 


GTATPTGAAG 


PARTTAAAAT 


AATACAGGGA 


AGAGATCTTN 

nwnvjn ± w x aii 


1600 




PTGTGTGGAP 


ATPTCATGAT 

nlvl wn x wr\ X 


GTGAACGGCA 


TACTSRCTGC 

X *»w X Ji\wl VJ w 


TAAAGGAGAC 


1650 






CAGACAACCA 


TTTACTTAAN 

x x x«*w-x x ruiti 


TAYCAGGCYY 

X**X W^**W W W X X 


TATTACTTGA 


1700 




AGAGCCAGTG 


CTGNGACTGC 


GCACTTGTCC 


AACTCTTAAA 


CCCAAACTTA 


1750 




TGCTGCCCAG 

x x w w w wnu 


AAGGATCTTT 


NTAGAGGTCC 


CCTTAGCCAA 


CCCTGACCTC 


1800 


25 


AACTATATAT 

nnv x *x a n x c\ x 


ATACTGATGG 

** X ** w< X V#*» X ww 


AAGTTCGTTT 


GTAGAAAAGG 


GATTACAAAG 


1850 




GGNAGGATAT 


NCCATAGGTG 

11 W w «* X *» w \J X 


TTAGTGATAA 


AGCAGTACTT 


GAAAGTAAGC 


1900 




CTCTTCCCCC 


CCAGGGACCA 

w wit w VJ w ** w w tk 


GCGCCCCCGT 


TAGCAGAACT 


AGTGGCACTG 


1950 




ACCCCGCGAG 


CCTTAGAACT 


TTGGAAAGGG 


AGGAGGATAA 


ATGTGTATAC 


2000 




AGATAGCAAG 


TATGCTTATC 


TAATCCGAAA 


TGCCCATGTT 


f~*f\ ft rp TV rp/~> /""• ft 

btAA 1 A 1 L»L» A 




30 


AAGAAAGGGA 


GTTCCTAACC 


TCTGGGGGAA 


CCCCCATTAA 


ATACCACAAG 


2100 




TTAATCATGG 


AGTTATTGCA 


CACAGTGCAA 


AAACTCAAGG 


AGGTGGAAGT 


2150 




CTTACACTGC 


CAAAGCCATC 


AGAAAAGGGA 


AAGAGGGGAA 


GAGCAGCATA 


2200 




AGTGGCTACA 


GAGGCAAGGA 


AAGACTAGCA 


GAAAGGAAAG 


AGAGAAAGAG 


2250 




ACAGAAAGTC 


AGAGAGAGAG 


AGAGGAAGAG 


ACAGAGCACA 


AAGAGGGAGT 


2300 


35 


CAGAGAGAGA 


GAGAGACAGA 


GAGTCAGAGA 


GAAGGAAAGA 


GAGAGAGGAA 


2350 




GAGACAAAGA 


ATGAH 








2365 
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(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 50
QLSSRSVTIR GVLGQPVTTX [r SQFL5CUWG] 1LLFSHAFLI MPESPTPLLG 100
RDILAKAGAI IHLNIGKGIP ICCPLLEEGI NPEVWAIEGQ YGQAKNARPV 150
QVKLKDSASF PYQRKYPLRP EALQGXQKIV KDLKAQGLVK PCSSPCNTPI 200
LGVRKPNGQW RLVQDLRIIN EAVFPLYPAV SSPYTLLSLI PEEAEWFTVL 250
DLKDAFFCIP VRPDSQFLFA FEDPLNPTSQ LTWTVLPQGF RDSPHLFGQA 300
LAQDLSQFSY LDTLVLQYVD DLLLVARSET LCHQATQELL TFLTTCGYKV 350
SKPKARLCSQ EIRYLGLKLS KGTRALSEER IQPILAYPHP KTLKQLRGFL 400
GITGFCRKQI PRYTPIARPL YTLIRETQKA NTYLVRWTPT EVAFQALKKA 450
LTQAPVFSLP TGQDFSLYAT EKTGIALGVL TQVSGMSLQP WYLSKEIDV 500
VAKGWPHCLW VMAAVAVLVS EAVKIIQGRD LTVWTSHDVN GILTAKGDLW 550
LSDNHLLNYQ ALLLEEPVLR LRTCATLKPA TFLPDNEEKI EHNCQQVIAQ 600
TYAARGDLLE VPLTDPDLNL YTDGSSLAEK GLRKAGYAVI SDNGILESNR 650
LTPGTSAHLA ELIALTWALE LGEGKRVNIY SDSKYAYLVL HAHAAIWRER 700
EFLTSEGTPI NHQEAIRRLL LAVQKPKEVA VLHCQGHQEE EEREIEGNRQ 750
ADIEAKKAAR QDSPLEML 768



(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 114 amino acids
(B) TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 50 
QLSSRSVTIR GVLGQPVTTY FSQPLSCDWG TLLFSHAFLI MPESPTPLLG 100 
RDILAKAGAI IHLN 114 



(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: amino acids
(B) TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

IGKGIPICCPLLEEGINPEVWAIEGQYGQAKNARPV
QWLKDSASFPYQRKYPLRPEALQGXQKIVKDLKAQGLVKPCSSPCNTPI
LGVRKPNGQWRLVQDLRIINEAVFPLYPAVSSPYTLLSLIPEEAEWFTVL
DLKDAFFCIPVRPDSQFLFAFEDPLNPTSQLTWTVLPQGFRDSPHLFGQA
LAQDLSQFSYLDTLVLQYVDDLLLVARSETLCHQATQELLTFLTTCGYKV
SKPKARLCSQEIRYLGLKLSKGTRALSEERIQPILAYPHPKTLKQLRGFL
GITGFCRKQIPRYTPIARPLYTLIRETQKANTYLVRWTPTEVAFQALKKA
LTQAPVFSLPTGQDFSLYATEKTGIALGVLTQVSGMSLQPWYLSKEIDV
VAKGWPHCLWVMAAVAVLVSEAVKIIQGRDLTVWTSHDVNGILTAKGDLW
LSDNHLLNYQALLLEEPVLRLRTCATLKPATFLPDNEEKIEHNCQQVIAQ
TYAARGDLLEVPLTDPDLNLYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: amino acids
(B) TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

LYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AACCCTTTGC CACTACATCA ATTT 24

(2) INFORMATION FOR SEQ ID NO: 103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
AGCAGCAGGA CTGAGGGT 18

(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GACAGCAAAT GGGTATTCCT TTCC 24

(2) INFORMATION FOR SEQ ID NO: 106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
AGGAGTAAGG AAACCCAACG GACA 24



(2) INFORMATION FOR SEQ ID NO: 107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
TGTATATAAT GGTCTGGCTA TTGGG 25

(2) INFORMATION FOR SEQ ID NO: 108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GGCTCTGCTC ACAGGAGATT AGATAC 26

(2) INFORMATION FOR SEQ ID NO: 110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
AAAGGCACCA GGGCCCTCAG TGAGGA 26



(2) INFORMATION FOR SEQ ID NO: 111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GGTTTAAGAG TTGCACAAGT GCGCAGTC 28

(2) INFORMATION FOR SEQ ID NO: 112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC CCCAGTACTC 60
AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT CCAGATGGCT 120
AGCCACTGAG GAAGGAAAAA TACTTTCACC TGCAGCTAAC CAACAGAAAT TACTTAAAAC 180
CCTTCACCAA ACCTTCCACT TAGGCATTGA TAGCACCCAT CAGATGGCCA AATTATTATT 240
TACTGGACCA GGCCTTTTCA AAACTATCAA GAAGATAGTC AGGGGCTGTG AAGTGTGCCA 300
AAGAAATAAT 310

(2) INFORMATION FOR SEQ ID NO: 113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 103 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Leu Ile Glu Gly Pro Leu Val Trp Gly Asn Pro Leu Trp Glu Thr Lys
1 5 10 15

Pro Gln Tyr Ser Ala Gly Lys Ile Glu Xaa Glu Thr Ser Gln Gly His
20 25 30

Thr Phe Leu Pro Ser Arg Trp Leu Ala Thr Glu Glu Gly Lys Ile Leu
35 40 45

Ser Pro Ala Ala Asn Gln Gln Lys Leu Leu Lys Thr Leu His Gln Thr
50 55 60

Phe His Leu Gly Ile Asp Ser Thr His Gln Met Ala Lys Leu Leu Phe
65 70 75 80

Thr Gly Pro Gly Leu Phe Lys Thr Ile Lys Lys Ile Val Arg Gly Cys
85 90 95

Glu Val Cys Gln Arg Asn Asn
100

(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 635 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

CCCTGTATCT TTAACCTCCT TGTTAAGTTT GTCTCTTCCA GAATCAAAAC TGTAAAACTA 60
CAAATTGTTC TTCAAATGGA GCACCAGATG GAGTCCATGA CTAAGATCCA CCGTGGACCC 120
CTGGACCGGC CTGCTAGCCC ATGCTCCGAT GTTAATGACA TTGAAGGCAC CCCTCCCGAG 180
GAAATCTCAA CTGCACAACC CCTACTATGC CCCAATTCAG CGGGAAGCAG TTAGAGCGGT 240
CATCAGCCAA CCTCCCCAAC AGCACTTGGG TTTTCCTGTT GAGAGGGGGG ACTGAGAGAC 300
AGGACTAGCT GGATTTCCTA GGCCAACGAA GAATCCCTAA GCCTAGCTGG GAAGGTGACT 360
GCATCCACCT CTAAACATGG GGCTTGCAAC TTAGCTCACA CCCGACCAAT CAGAGAGCTC 420
ACTAAAATGC TAATTAGGCA AAAATAGGAG GTAAAGAAAT AGCCAATCAT CTATTGCCTG 480
AGAGCACAGC GGGAGGGACA AGGATCGGGA TATAAACCCA GGCATTCGAG CCGGCAACGG 540
CAACCCCCTT TGGGTCCCCT CCCTTTGTAT GGGCGCTCTG TTTTCACTCT ATTTCACTCT 600
ATTAAATCTT GCAACTGAAA AAAAAAAAAA AAAAA 635

(2) INFORMATION FOR SEQ ID NO: 115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile Lys
1 5 10 15

Thr Val Lys Leu Gln Ile Val Leu Gln Met Glu His Gln Met Glu Ser
20 25 30

Met Thr Lys Ile His Arg Gly Pro Leu Asp Arg Pro Ala Ser Pro Cys
35 40 45

Ser Asp Val Asn Asp Ile Glu Gly Thr Pro Pro Glu Glu Ile Ser Thr
50 55 60

Ala Gln Pro Leu Leu Cys Pro Asn Ser Ala Gly Ser Ser
65 70 75

(2) INFORMATION FOR SEQ ID NO: 116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
TGGGGTTCCA TTTGTAAGAC CATCTGTAGC TT 32

(2) INFORMATION FOR SEQ ID NO: 117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1481 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATGGCCCTCC CTTATCATAC TTTTCTCTTT ACTGTTCTCT TACCCCCTTT CGCTCTCACT 60
GCACCCCCTC CATGCTGCTG TACAACCAGT AGCTCCCCTT ACCAAGAGTT TCTATGAAGA 120
ACGCGGCTTC CTGGAAATAT TGATGCCCCA TCATATAGGA GTTTATCTAA GGGAAACTCC 180
ACCTTCACTG CCCACACCCA TATGCCCCGC AACTGCTATA ACTCTGCCAC TCTTTGCATG 240
CATGCAAATA CTCATTATTG GACAGGGAAA ATGATTAATC CTAGTTGTCC TGGAGGACTT 300
GGAGCCACTG TCTGTTGGAC TTACTTCACC CATACCAGTA TGTCTGATGG GGGTGGAATT 360
CAAGGTCAGG CAAGAGAAAA ACAAGTAAAG GAAGCAATCT CCCAACTGAC CCGGGGACAT 420
AGCACCCCTA GCCCCTACAA AGGACTAGTT CTCTCAAAAC TACATGAAAC CCTCCGTACC 480
CATACTCGCC TGGTGAGCCT ATTTAATACC ACCCTCACTC GGCTCCATGA GGTCTCAGCC 540
CAAAACCCTA CTAACTGTTG GATGTGCCTC CCCCTGCACT TCAGGCCATA CATTTCAATC 600
CCTGTTCCTG AACAATGGAA CAACTTCAGC ACAGAAATAA ACACCACTTf* [X X X X flVJ x t\] 660
GGACCTCTTG TTTCCAATCT GGAAATAACC CATACCTCAA ACCTCACCTG TGTAAAATTT 720
AGCAATACTA TAGACACAAC CAGCTCCCAA TGCATCAGGT GGGTAACACC TCCCACACGA 780
ATAGTCTGCC TACCCTCAGG AATATTTTTT GTCTGTGGTA CCTCAGCCTA TCATTGTTTfi 840
AATGGCTCTT CAGAATCTAT GTGCTTCCTC TCATTCTTAG TGCCCCCTAT GACCATCTAC 900
ACTGAACAAG ATTTATACAA TCATGTCGTA CCTAAGCCCC ACAACAAAAG AGTACCCATT 960
CTTCCTTTTG TTATCAGAGC AGGAGTGCTA GGCAGACTAG GTACTGGCAT TGGCAGTATC 1020
ACAACCTCTA CTCAGTTCTA CTACAAACTA TCTCAAGAAA TAAATGGTGA CATGGAACAG 1080
GTCACTGACT CCCTGGTCAC CTTGCAAGAT CAACTTAACT CCCTAGCAGC AGTAGTCCTT 1140
CAAAATCGAA GAGCTTTAGA CTTGCTAACC GCCAAAAGAG GGGGAACCTG TTTATTTTTA 1200
GGAGAAGAAC GCTGTTATTA TGTTAATCAA TCCAGAATTG TCACTGAGAA AGTTAAAGAA 1260
ATTCGAGATC GAATACAATG TAGAGCAGAG GAGCTTCAAA ACACCGAACG CTGGGGCCTC 1320
CTCAGCCAAT GGATGCCCTG GGTTCTCCCC TTCTTAGGAC CTCTAGCAGC TCTAATATTG 1380
TTACTCCTCT TTGGACCCTG TATCTTTAAC CTCCTTGTTA AGTTTGTCTC TTCCAGAATT 1440
GAAGCTGTAA AGCTACAGAT GGTCTTACAA ATGGAACCCC A 1481



(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 493 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Ala Leu Pro Tyr His Thr Phe Leu Phe Thr Val Leu Leu Pro Pro
1 5 10 15

Phe Ala Leu Thr Ala Pro Pro Pro Cys Cys Cys Thr Thr Ser Ser Ser
20 25 30

Pro Tyr Gln Glu Phe Leu Xaa Arg Thr Arg Leu Pro Gly Asn Ile Asp
35 40 45

Ala Pro Ser Tyr Arg Ser Leu Ser Lys Gly Asn Ser Thr Phe Thr Ala
50 55 60

His Thr His Met Pro Arg Asn Cys Tyr Asn Ser Ala Thr Leu Cys Met
65 70 75 80

His Ala Asn Thr His Tyr Trp Thr Gly Lys Met Ile Asn Pro Ser Cys
85 90 95

Pro Gly Gly Leu Gly Ala Thr Val Cys Trp Thr Tyr Phe Thr His Thr
100 105 110

Ser Met Ser Asp Gly Gly Gly Ile Gln Gly Gln Ala Arg Glu Lys Gln
115 120 125

Val Lys Glu Ala Ile Ser Gln Leu Thr Arg Gly His Ser Thr Pro Ser
130 135 140

Pro Tyr Lys Gly Leu Val Leu Ser Lys Leu His Glu Thr Leu Arg Thr
145 150 155 160

His Thr Arg Leu Val Ser Leu Phe Asn Thr Thr Leu Thr Arg Leu His
165 170 175

Glu Val Ser Ala Gln Asn Pro Thr Asn Cys Trp Met Cys Leu Pro Leu
180 185 190

His Phe Arg Pro Tyr Ile Ser Ile Pro Val Pro Glu Gln Trp Asn Asn
195 200 205

Phe Ser Thr Glu Ile Asn Thr Thr Ser Val Leu Val Gly Pro Leu Val
210 215 220

Ser Asn Leu Glu Ile Thr His Thr Ser Asn Leu Thr Cys Val Lys Phe
225 230 235 240

Ser Asn Thr Ile Asp Thr Thr Ser Ser Gln Cys Ile Arg Trp Val Thr
245 250 255

Pro Pro Thr Arg Ile Val Cys Leu Pro Ser Gly Ile Phe Phe Val Cys
260 265 270

Gly Thr Ser Ala Tyr His Cys Leu Asn Gly Ser Ser Glu Ser Met Cys
275 280 285

Phe Leu Ser Phe Leu Val Pro Pro Met Thr Ile Tyr Thr Glu Gln Asp
290 295 300

Leu Tyr Asn His Val Val Pro Lys Pro His Asn Lys Arg Val Pro Ile
305 310 315 320

Leu Pro Phe Val Ile Arg Ala Gly Val Leu Gly Arg Leu Gly Thr Gly
325 330 335

Ile Gly Ser Ile Thr Thr Ser Thr Gln Phe Tyr Tyr Lys Leu Ser Gln
340 345 350

Glu Ile Asn Gly Asp Met Glu Gln Val Thr Asp Ser Leu Val Thr Leu
355 360 365

Gln Asp Gln Leu Asn Ser Leu Ala Ala Val Val Leu Gln Asn Arg Arg
370 375 380

Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr Cys Leu Phe Leu
385 390 395 400

Gly Glu Glu Arg Cys Tyr Tyr Val Asn Gln Ser Arg Ile Val Thr Glu
405 410 415

Lys Val Lys Glu Ile Arg Asp Arg Ile Gln Cys Arg Ala Glu Glu Leu
420 425 430

Gln Asn Thr Glu Arg Trp Gly Leu Leu Ser Gln Trp Met Pro Trp Val
435 440 445

Leu Pro Phe Leu Gly Pro Leu Ala Ala Leu Ile Leu Leu Leu Leu Phe
450 455 460

Gly Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile
465 470 475 480

Glu Ala Val Lys Leu Gln Met Val Leu Gln Met Glu Pro
485 490

(2) INFORMATION FOR SEQ ID NO: 119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CG 32



(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1329 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CGCCAAAAGA GGGGGAACCT GTTTATTTTT 60
AGGGGAAGAA TGCTGTTAGT ATGTTAATCA ATCTGGAATC ATTACTGAGA AAGTTAAAGA 120
AATTTGAGAT CGAATATAAT GTAGAGCAGA GGACCTTCAA AACACTGCAC CCTGGGGCCT 180
CCTCAGCCAA TGGATGCCCT GGACTCTCCC CTTCTTAGGA CCTCTAGCAG CTATAATATT 240
TTTACTCCTC TTTGGACCCT GTATCTTCAA CTTCCTTGTT AAGTTTGTCT CTTCCAGAAT 300
TGAAGCTGTA AAGCTACAAA TAGTTCTTCA AATGGAACCC CAGATGCAGT CCATGACTAA 360
AATCTACCGT GGACCCCTGG ACCGGCCTGC TAGACTATGC TCTGATGTTA ATGACATTGA 420
AGTCACCCCT CCCGAGGAAA TCTCAACTGC ACAACCCCTA CTACACTCCA ATTCAGTAGG 480
AAGCAGTTAG AGCAGTTGTC AGCCAACCTC CCCAACAGTA CTTGGGTTTT CCTGTTGAGA 540
GGGTGGACTG AGAGACAGGA CTAGCTGGAT TTCCTAGGCT GACTAAGAAT CCCNAAGCCT 600
ANCTGGGAAG GTGACCGCAT CCATCTTTAA ACATGGGGCT TGCAACTTAG CTCACACCCG 660
ACCAATCAGA GAGCTCACTA AAATGCTAAT CAGGCAAAAA CAGGAGGTAA AGCAATAGCC 720
AATCATCTAT TGCCTGAGAG CACAGCGGGA AGGACAAGGA TTGGGATATA AACTCAGGCA 780
TTCAAGCCAG CAACAGCAAC CCCCTTTGGG TCCCCTCCCA TTGTATGGGA GCTCTGTTTT 840
CACTCTATTT CACTCTATTA AATCATGCAA CTGCACTCTT CTGGTCCGTG TTTTTTATGG 900
CTCAAGCTGA GCTTTTGTTC GCCATCCACC ACTGCTGTTT GCCACCGTCA CAGACCCGCT 960
GCTGACTTCC ATCCCTTTGG ATCCAGCAGA GTGTCCACTG TGCTCCTGAT CCAGCGAGGT 1020
ACCCATTGCC ACTCCCGATC AGGCTAAAGG CTTGCCATTG TTCCTGCATG GCTAAGTGCC 1080
TGGGTTTGTC CTAATAGAAC TGAACACTGG TCACTGGGTT CCATGGTTCT CTTCCATGAC 1140
CCACGGCTTC TAATAGAGCT ATAACACTCA CCGCATGGCC CAAGATTCCA TTCCTTGGTA 1200
TCTGTGAGGC CAAGAACCCC AGGTCAGAGA ANGTGAGGCT TGCCACCATT TGGGAAGTGG 1260
CCCACTGCCA TTTTGGTAGC GGCCCACCAC CATCTTGGGA GCTGTGGGAG CAAGGATCCC 1320
CCAGTAACA 1329



(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 162 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Gln Asn Arg Arg Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr
1 5 10 15

Cys Leu Phe Leu Gly Glu Glu Cys Cys Xaa Tyr Val Asn Gln Ser Gly
20 25 30

Ile Ile Thr Glu Lys Val Lys Glu Ile Xaa Asp Arg Ile Xaa Cys Arg
35 40 45

Ala Glu Asp Leu Gln Asn Thr Ala Pro Trp Gly Leu Leu Ser Gln Trp
50 55 60

Met Pro Trp Thr Leu Pro Phe Leu Gly Pro Leu Ala Ala Ile Ile Phe
65 70 75 80

Leu Leu Leu Phe Gly Pro Cys Ile Phe Asn Phe Leu Val Lys Phe Val
85 90 95

Ser Ser Arg Ile Glu Ala Val Lys Leu Gln Ile Val Leu Gln Met Glu
100 105 110

Pro Gln Met Gln Ser Met Thr Lys Ile Tyr Arg Gly Pro Leu Asp Arg
115 120 125

Pro Ala Arg Leu Cys Ser Asp Val Asn Asp Ile Glu Val Thr Pro Pro
130 135 140

Glu Glu Ile Ser Thr Ala Gln Pro Leu Leu His Ser Asn Ser Val Gly
145 150 155 160

Ser Ser

(2) INFORMATION FOR SEQ ID NO: 122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GGCATTGATA GCACCCATCA G 21

(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
CATGTCACCA GGGTGGAATA G 21

(2) INFORMATION FOR SEQ ID NO: 124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 758 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

GGCATTGATA GCACCCATCA GATGGCCAAA TCATTATTTA CTGGACCAGG CCTTTTCAAA 60
ACTATCAAGC AGATAGGGCC CGTGAAGCAT GCCAAAGAAA TAATCCCCTG CCTTATCGCC 120
ATGTTCCTTC AGGAGAACAA AGAACAGGCC ATTACCCAGG GGAAGACTGG CAACTAGATT 180
TTACCCACAT GGCCAAATGT CAGGGATTTC AGCATCTACT AGTCTGGGCA GATACTTTCA 240
CTGGTTGGGT GGAGTCTTCT CCTTGTAGGA CAGAAAAGAC CCAAGAGGTA ATAAAGGCAC 300
TAATGAAATA ATTCCCAGAT TTGGACTTCC CCCAGGATTA CAGGGTGACA ATGGCCCCGC 360
TTTCAAGGCT GCAGTAACCC AGGGAGTATC CCAGGTGTTA GGCATACAAT ATCACTTACA 420
CTGTGCCTGG AGGCCACAAT CCTCCAGAAA AGTCAAGAAA ATGAATGAAA CACTCAAAGA 480
TCTAAAAAAG CTAACCCAAG AAACCCACAT TGCATGACCT GTTCTGTTGC CTATAACCTT 540
ACTAAGAATC CATAACTATC CCCCAAAAAG CAGGACTTAG CCCATACGAG ATGCTATATG 600
GATGGCCTTT CCTAACCAAT GACCTTGTGC TTGACTGAGA AATGGCCAAC TTAGTTGCAG 660
ACATCACCTC CTTAGCCAAA TATCAACAAG TTCTTAAAAC ATCACAGGGA ACCTGTCCCC 720
GAGAGGAGGG AAAGGAACTA TTCCACCCTG GTGACATG 758



(2) INFORMATION FOR SEQ ID NO: 126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
CGGACATCCA AAGTGATGGG AAACG 25

(2) INFORMATION FOR SEQ ID NO: 127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
GGACAGGAAA GTAAGACTGA GAAGGC 26

(2) INFORMATION FOR SEQ ID NO: 128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
CCTAGAACGT ATTCTGGAGA ATTGGG 26

(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
TGGCTCTCAA TGGTCAAACA TACCCG 26



(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1511 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

CCTAGAACGT ATTCTGGAGA ATTGGGACCA ATGTGACACT CAGACGCTAA GAAAGAAACG 60
ATTTATATTC TTCTGCAGTA CCGCCTGGCC ACAATATCCT CTTCAAGGGA GAGAAACCTG 120
GCTTCCTGAG GGAAGTATAA ATTATAACAT CATCTTACAG CTAGACCTCT TCTGTAGAAA 180
GGAGGGCAAA TGGAGTGAAG TGCCATATGT GCAAACTTTC TTTTCATTAA GAGACAACTC 240
ACAATTATGT AAAAAGTGTG GTTTATGCCC TACAGGAAGC CCTCAGAGTC CACCTCCCTA 300
CCCCAGCGTC CCCTCCCCGA CTCCTTCCTC AACTAATAAG GACCCCCCTT TAACCCAAAC 360
GGTCCAAAAG GAGATAGACA AAGGGGTAAA CAATGAACCA AAGAGTGCCA ATATTCCCCG 420
ATTATGCCCC CTCCAAGCAG TGAGAGGAGG AGAATTCGGC CCAGCCAGAG TGCCTGTACC 480
TTTTTCTCTC TCAGACTTAA AGCAAATTAA AATAGACCTA GGTAAATTCT CAGATAACCC 540
TGACGGCTAT ATTGATGTTT TACAAGGGTT AGGACAATCC TTTGATCTGA CATGGAGAGA 600
TATAATGTTA CTACTAAATC AGACACTAAC CCCAAATGAG AGAAGTGCCG CTGTAACTGC 660
AGCCCGAGAG TTTGGCGATC TTTGGTATCT CAGTCAGGCC AACAATAGGA TGACAACAGA 720
GGAAAGAACA ACTCCCACAG GCCAGCAGGC AGTTCCCAGT GTAGACCCTC ATTGGGACAC 780
AGAATCAGAA CATGGAGATT GGTGCCACAA ACATTTGCTA ACTTGCGTGC TAGAAGGACT 840
GAGGAAAACT AGGAAGAAGC CTATGAATTA CTCAATGATG TCCACTATAA CACAGGGAAA 900
GGAAGAAAAT CTTACTGCTT TTCTGGACAG ACTAAGGGAG GCATTGAGGA AGCATACCTC 960
CCTGTCACCT GACTCTATTG AAGGCCAACT AATCTTAAAG GATAAGTTTA TCACTCAGTC 1020
AGCTGCAGAC ATTAGAAAAA ACTTCAAAAG TCTGCCTTAG GCCCGGAGCA GAACTTAGAA 1080
ACCCTATTTA ACTTGGCATC CTCAGTTTTT TATAATAGAG ATCAGGAGGA GCAGGCGAAA 1140
CGGGACAAAC GGGATAAAAA AAAAAGGGGG GGTCCACTAC TTTAGTCATG GCCCTCAGGC 1200
AAGCAGACTT TGGAGGCTCT GCAAAAGGGA AAAGCTGGGC AAATCAAATG CCTAATAGGG 1260
CTGGCTTCCA GTGCGGTCTA CAAGGACACT TTAAAAAAGA TTATCCAAGT AGAAATAAGC 1320
CGCCCCCTTG TCCATGCCCC TTACGTCAAG GGAATCACTG GAAGGCCCAC TGCCCCAGGG 1380
GATGAAGATA CTCTGAGTCA GAAGCCATTA ACCAGATGAT CCAGCAGCAG GACTGAGGGT 1440
GCCCGGGGCG AGCGCCAGCC CATGCCATCA CCCTCACAGA GCCCCGGGTA TGTTTGACCA 1500
TTGAGAGCCA A 1511



(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu
1 5 10 15

Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr
20 25 30

Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr
35 40 45

Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp
50 55 60

Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser
65 70 75 80

Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser
85 90 95

Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn
100 105 110

Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly
115 120 125

Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu
130 135 140

Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro
145 150 155 160

Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe
165 170 175

Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln
180 185 190

Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr
195 200 205

Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe
210 215 220

Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu
225 230 235 240

Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro
245 250 255

His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu
260 265 270

Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met
275 280 285

Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu
290 295 300

Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser
305 310 315 320

Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe
325 330 335

Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro
340 345 350



(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
TGCTGGAATT CGGGATCCTA GAACGTATTC 30

(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
AGTTCTGCTC CGAAGCTTAG GCAGACTTTT 30



(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 398 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

Arg Gly Ser His Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg
20 25 30

Ile Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr
35 40 45

Leu Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln
50 55 60

Tyr Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn
65 70 75 80

Tyr Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys
85 90 95

Trp Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn
100 105 110

Ser Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln
115 120 125

Ser Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr
130 135 140

Asn Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys
145 150 155 160

Gly Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro
165 170 175

Leu Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val
180 185 190

Pro Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys
195 200 205

Phe Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly
210 215 220

Gln Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln
225 230 235 240

Thr Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu
245 250 255

Phe Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr
260 265 270

Glu Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp
275 280 285

Pro His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His
290 295 300

Leu Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro
305 310 315 320

Met Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn
325 330 335

Leu Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr
340 345 350

Ser Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys
355 360 365

Phe Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu
370 375 380

Pro Lys Leu Ala Ala Ala Leu Glu His His His His His His
385 390 395

(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg Ile Leu Glu Arg
1 5 10 15
Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu Arg Lys Lys
20 25 30
Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr Pro Leu Gln
35 40 45
Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr Asn Ile Ile
50 55 60
Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp Ser Glu Val
65 70 75 80
Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser Gln Leu Cys
85 90 95
Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser Pro Pro Pro
100 105 110
Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn Lys Asp Pro
115 120 125
Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly Val Asn Asn
130 135 140
Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu Gln Ala Val
145 150 155 160
Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro Phe Ser Leu
165 170 175
Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe Ser Asp Asn
180 185 190
Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln Ser Phe Asp
195 200 205
Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr Leu Thr Pro
210 215 220
Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe Gly Asp Leu
225 230 235 240
Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu Glu Arg Thr
245 250 255
Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro His Trp Asp
260 265 270
Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu Leu Thr Cys
275 280 285
Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met Asn Tyr Ser
290 295 300
Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu Thr Ala Phe
305 310 315 320
Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser Leu Ser Pro
325 330 335
Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe Ile Thr Gln
340 345 350
Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro Lys Leu Ala
355 360 365
Ala Ala Leu Glu His His His His His His
370 375






(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

CTTGGAGGGT GCATAACCAG GGAAT 25



(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

TGTCCGCTGT GCTCCTGATC 20



(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

CTATGTCCTT TTGGACTGTT TGGGT 25



(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 764 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:



TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60
CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120
TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180
CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240
ACACAAGGCT TGCCACCA[remainder of row illegible in source] 300
TATCTTGGGA GCTCTGGGAG CAAGGACCCC AGGTAACAAT TTGGTGACCA CGAAGGGACC 360
TGAATCCGCA ACCATGAAGG GATCTCCAAA GCAATTGGAA ATGTTCCTCC CAAGGCAAAA 420
ATGCCCCTAA GATGTATTCT GGAGAATTGG GACCAATTTG ACCCTCAGAC AGTAAGAAAA 480
AAATGACTTA TATTCTTCTG CAGTACCGCC CTGGCCACGA TATCCTCTTC AAGGGGGAGA 540
AACCTGGCCT CCTGAGGGAA GTATAAATTA TAACACCATC TTACAGCTAG ACCTGTTTTG 600
TAGAAAAGGA GGCAAATGGA GTGAAGTGCC ATATTTACAA ACTTTCTTTT CATTAAAAGA 660
CAACTCGCAA TTATGTTAAC AGTGTGATTT GTGTTCCTAC ACGGAAGCCC TCAGATTCTA 720
CTCCCCACCC CCGGCATCTC CCCTGAATCC CTCCCCAACT TATT 764

(2) INFORMATION FOR SEQ ID NO: 142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:





TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60
CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120
TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180
CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240
ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC GGCCCGCCAC 300
TATCTTGGGA GCTCTGGGAG CAAGGACCCC CAGGTAACAA TTTGGTGACC ACGAAGGGAC 360
CTGAATCCGC AACCATGAAG GGATCTCCAA AGCAATTGGA AATGTTCCTC CCAAGGCAAA 420
AATGCCCCTA AGATGTATTC TGGAGAATTG GGACCAATCT GACCCTCAGA CAGTAAGAAA 480
AAAAATGACT TATATTCTTC TGCAGTACCG CCTGGCCACG GATATCCTCT TCAAGGGGGA 540
GAAACCTGGC CTCCTGAGGG AAGTATAAAT TATAACACCA TCTTACAGCT AGACCTGTTT 600
TGTAGAAAAG GAGGCAAATG GAGTGAAGTG CCATATTTAC AAACTTTCTT TTCATTAAAA 660
GACAACTCGC AATTATGTAA ACAGTGTGAT TTGTGTCCTA CAGGAAGCCC TCAGATCTAC 720
CTCCCTACCC CGGCATCTCC CTGACTCCTT CCCCAACTAA TAAGGACCCA CTTCAGCCCA 780
AACAGTCCAA AAGGACATAG 800



(2) INFORMATION FOR SEQ ID NO: 169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

consensus (41/68-1 + 42/68-1 + cl43 68-1)



(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438



(2) INFORMATION FOR SEQ ID NO: 171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAGTAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAGTACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438



(2) INFORMATION FOR SEQ ID NO: 172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

GACTTGAGCC AGTCYTCATA CCTGGACAYT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAATAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACATCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438

(2) INFORMATION FOR SEQ ID NO: 173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTPEAEV AFQALK 146
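
The relationship between the nucleotide entries and the peptide entries of this listing can be illustrated with a short Python sketch. It translates the first 60 bases of SEQ ID NO: 170 and compares the result with the first 20 residues of SEQ ID NO: 173. The function name `translate`, the use of the standard genetic code, and the assumption that the reading frame starts at base 1 are illustrative choices and are not stated in the listing itself.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), written in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; codons with ambiguous bases give 'X'."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Bases 1 to 60 of SEQ ID NO: 170, copied from the listing.
SEQ170_ROW1 = "GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT"
# Residues 1 to 20 of SEQ ID NO: 173, copied from the listing.
SEQ173_START = "DLSQSSYLDTLVLRYMDDLL"

peptide = translate(SEQ170_ROW1)
print(peptide, peptide == SEQ173_START)
```

The same function could be applied to the other 438-base and 429-base entries, with the caveat that any codon containing an ambiguity code is reported as X even when every possible base would encode the same residue.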

(2) INFORMATION FOR SEQ ID NO: 174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKVARPLNTR IKETQKASTH LVRWTPEAEV AFQALK 146

(2) INFORMATION FOR SEQ ID NO: 175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

DLSQSSYLDX LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTSEAEV AFQALK 146

(2) INFORMATION FOR SEQ ID NO: 176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

consensus (1/46-7+8/46-7+C15/46/7)

(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGCGCTCTT AAATTTCCTT 120
GCTACCTGTG GCTCCAAACA AAAGGCTCAC CTCTGCTCAC ACCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240
TATCCTCATC CCATAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATATC AGCCTTCTGC 300
CGAATATGGA TTCCCGGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AATTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATAGGGA TGATTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGTGCTCTT AAATTTCCTC 120
GCTACCTGTG GCTCCAAACA AAGGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCGC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGAT 240
TATCCTCATC CCAAAACCAT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGCCTTCTGC 300
CGAATATGGA TTCCCGGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AGTTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AGACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCCTC AGTATGGGGA TGATTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAC CAAGCCACCC AAGCGCTCTT AAATTTCCTC 120
GCTACCTGTG GCTCCAAACA AAAGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240
TATCCCCATC CCAAAACCCT AAAGCAACTA AGARGGTTCC TTGGCATAAC AGCCTTCTGC 300
CGAATATGGA TTCCCAGATA CAGCGAAATA GCCAGGCCAT TATGTACATT ATCTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

DLSQSSYLDI LVLQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAH 50
LCSHQVKYLG LKLSKVTRAL REERIQRILA YPHPITLKQL RGFLGISAFC 100
RIWIPGYSEI ARPLCTLIKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

DLSQSSYLDI LVLQYRDDLI IATHSETLWH QATQVLLNFL ATCGSKQRAQ 50
LCSQQVKYLG LKLSKVARAL REERIQRILD YPHPKTIKQL RGFLGITAFC 100
RIWIPRYSEI ARPLCTLVKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

DLSQSSYLDI LVPQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAQ 50
LCSQQVKYLG LKLSKVTRAL REERIQRILA YPHPKTLKQL RXFLGITAFC 100
RIWIPRYSEI ARPLCTLSKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

GGCCAGGCAT CAGCCCAAGA CTTGA 25

(2) INFORMATION FOR SEQ ID NO: 184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

TGCAAGCTCA TCCCTSRGAC CT 22
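
SEQ ID NO: 184 contains the letters S and R, which under the standard IUPAC nucleotide nomenclature denote C or G and A or G respectively. The Python sketch below enumerates the concrete sequences covered by such a degenerate entry. The helper name `expand_degenerate` is hypothetical, and reading S and R as IUPAC ambiguity codes is an assumption, since the listing does not define them.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (standard nomenclature).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq: str) -> list[str]:
    """Return every unambiguous sequence matched by a degenerate sequence."""
    seq = seq.replace(" ", "").upper()
    choices = [IUPAC[base] for base in seq]
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 184, copied from the listing.
SEQ184 = "TGCAAGCTCA TCCCTSRGAC CT"

variants = expand_degenerate(SEQ184)
for variant in variants:
    print(variant)
```

With two two-fold degenerate positions the entry covers four concrete 22-base sequences.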



(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

GACTTGAGCC AGTCCTCATA CCT 23






(2) INFORMATION FOR SEQ ID NO: 186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

CTTTAGGGCC TGGAAAGCCA CT 22
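
Comparing the printed strings shows that SEQ ID NO: 185 coincides with the first 23 bases of SEQ ID NO: 170 and that SEQ ID NO: 186 is the reverse complement of its last 22 bases. The Python sketch below reproduces that comparison. It only compares the strings as printed; this part of the listing does not itself state that the two short sequences were designed against SEQ ID NO: 170, and the function name `reverse_complement` is an illustrative choice.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the spacing used in the listing and normalise to upper case."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# Bases 1 to 30 and 411 to 438 of SEQ ID NO: 170, copied from the listing.
SEQ170_START = "GACTTGAGCC AGTCCTCATA CCTGGACACT"
SEQ170_END = "AGCAGAAGTG GCTTTCCAGG CCCTAAAG"
# SEQ ID NO: 185 and SEQ ID NO: 186, copied from the listing.
SEQ185 = "GACTTGAGCC AGTCCTCATA CCT"
SEQ186 = "CTTTAGGGCC TGGAAAGCCA CT"

matches_5_prime = clean(SEQ170_START).startswith(clean(SEQ185))
matches_3_prime = clean(SEQ170_END).endswith(reverse_complement(SEQ186))
print(matches_5_prime, matches_3_prime)
```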

TABLE NO. 5

TABLE NO. 6

CLAIMS

1. Nucleic material, in the isolated or
purified state, comprising a nucleotide sequence selected
from the group including sequences SEQ ID NO: 93, SEQ ID
NO: 94, their complementary sequences and their equivalent 
sequences, in particular nucleotide sequences displaying, 
for any succession of 100 contiguous monomers, at least 
50% and preferably at least 60% homology with said 
sequence SEQ ID NO: 93, SEQ ID NO: 94 and their 
complementary sequences, excluding HSERV-9 sequence. 

2. Nucleic material of claim 1, the nucleotide
sequence of which is selected from the group including
sequences SEQ ID NO: 93, SEQ ID NO: 94, their complementary 
sequences and their equivalent sequences, in particular 
nucleotide sequences displaying, for any succession of 100 
contiguous monomers, at least 70% and preferably at least 
80% homology with said sequence SEQ ID NO: 93, SEQ ID NO: 94 
and their complementary sequences. 
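
The windowed criterion used in claims 1 and 2 ("for any succession of 100 contiguous monomers, at least 50% ... homology") can be illustrated with a short Python sketch. The claims do not define how homology is computed, so the sketch assumes the simplest reading: percent identity between two sequences that are already aligned, of equal length and without gaps. The function name `min_window_identity` is hypothetical, and the demonstration uses a 20-base window on a 60-base excerpt only to keep the example short.

```python
def min_window_identity(a: str, b: str, window: int = 100) -> float:
    """Lowest fraction of identical positions over all windows of a given size.

    Assumes a and b are pre-aligned, gap-free and of equal length.
    """
    a = a.replace(" ", "").upper()
    b = b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not 0 < window <= len(a):
        raise ValueError("window must be between 1 and the sequence length")
    matches = [x == y for x, y in zip(a, b)]
    current = sum(matches[:window])  # identical positions in the first window
    lowest = current
    for i in range(window, len(matches)):
        # Slide the window by one position: add the new base, drop the old one.
        current += matches[i] - matches[i - window]
        lowest = min(lowest, current)
    return lowest / window


# Bases 301 to 360 of SEQ ID NO: 170 and SEQ ID NO: 171, copied from the listing.
SEQ170_ROW6 = "GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA"
SEQ171_ROW6 = "GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAGTAG CCAGACCATT AAATACACGA"

print(min_window_identity(SEQ170_ROW6, SEQ171_ROW6, window=20))
```

A sequence would satisfy the 50% threshold of claim 1 under this reading if the returned value, computed with a window of 100, is at least 0.5. Alignments with insertions or deletions would need a proper alignment step first, which this sketch does not attempt.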

3. Nucleic material, in the isolated or 
purified state, coding for any polypeptide displaying, for 
any contiguous succession of at least 30 amino acids, at 
least 50%, preferably at least 60%, and most preferably
at least 70% homology with a peptide sequence encoded by 
any nucleotide sequence selected from the group including 
SEQ ID NO: 93, SEQ ID NO: 94 and their complementary 
sequence. 

4. Nucleic material, in the isolated or 
purified state, of retroviral type, comprising a 
nucleotide sequence identical or equivalent to at least 
part of the pol gene of an isolated retrovirus associated 
with multiple sclerosis or rheumatoid arthritis. 

5. Nucleic material as claimed in claim 4,
said nucleotide sequence being 80% homologous to said at
least part of the pol gene.

6. Nucleic material comprising a nucleotide
sequence identical or equivalent to at least part of the
pol gene of an isolated virus encoding a reverse
transcriptase comprising an enzymatic site comprised
between the amino acid domains LPQG and YXDD, said virus
having a phylogenic distance with HSERV-9 of 0.063 ± 0.1,
and preferably 0.063 ± 0.05.

7. Nucleotide fragment comprising a nucleotide
sequence selected from the group including SEQ ID NO: 93,
SEQ ID NO: 94, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 50% and preferably at least 60% homology with
said sequences and their complementary sequences, said
group excluding SEQ ID NO:1, and said nucleotide fragment
neither comprising nor consisting of the sequence HSERV-9.

8. Nucleotide fragment of claim 7, the nucleotide
sequence of which is selected from the group including SEQ
ID NO: 93, SEQ ID NO: 94, their complementary sequences and
their equivalent sequences, in particular nucleotide
sequences displaying, for any succession of 100 contiguous
monomers, at least 70% and preferably at least 80%
homology with said sequences and their complementary
sequences.

9. Nucleotide fragment comprising a coding
nucleotide sequence which is at least partially identical
to a nucleotide sequence selected from the group
including:

SEQ ID NO: 93, SEQ ID NO: 94; their complementary
sequences; their equivalent sequences, in particular
homologous to SEQ ID NO: 93, SEQ ID NO: 94;

sequences encoding at least part of the peptide
sequence defined by SEQ ID NO: 95;

sequences encoding at least part of a peptide
sequence equivalent, in particular homologous to SEQ ID
NO: 95, which is capable of being recognized by sera of
patients infected with the MSRV-1 virus, or in whom the
MSRV-1 virus has been reactivated.

10. Nucleic acid probe for the detection of a
virus associated with multiple sclerosis or rheumatoid
arthritis, characterized in that it is capable of
hybridizing specifically with any fragment according to
any one of claims 7 to 9.

11. Probe as claimed in claim 10, consisting of 
between 10 and 1,000 monomers. 

12. Primer for the amplification by
polymerization of an RNA or a DNA of a viral material
associated with multiple sclerosis or rheumatoid
arthritis, comprising a nucleotide sequence identical or
equivalent to at least one portion of the nucleotide
sequence of a fragment as claimed in any one of claims 7
to 9, in particular a nucleotide sequence displaying, for
any succession of at least 10 contiguous monomers,
preferably 15 contiguous monomers, more preferably 18
contiguous monomers and even most preferably 20 contiguous
monomers, at least 70% homology with at least the said
portion of the said fragment.

13. Primer as claimed in claim 12, comprising a
sequence selected from the group consisting of SEQ ID NO: 
99 to SEQ ID NO: 111. 

14. Polypeptide encoded by any open reading
frame belonging to a nucleotide fragment as claimed in any
one of claims 7 to 9.

15. Polypeptide of claim 14, characterized in
that the open reading frame encoding it is comprised, in
the 5'-3' direction, between nucleotide 18 and nucleotide
2304 of SEQ ID NO: 93.

16. Polypeptide according to claim 15, 
comprising a peptide sequence at least partially identical 
to SEQ ID NO: 95. 

17. Polypeptide, comprising a peptide sequence
at least partially identical to SEQ ID NO: 96.

18. Polypeptide of claim 17 exhibiting an
enzymatic activity consisting of proteolytic activity. 

19. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 18 and ends at nucleotide 340 of SEQ ID
NO: 93.

20. Polypeptide exhibiting an inhibitory 
activity on the proteolytic activity of polypeptide of 
claim 18. 

21. Polypeptide, comprising a peptide sequence
identical or equivalent to SEQ ID NO: 97.

22. Polypeptide of claim 21, comprising a 
peptide sequence identical or equivalent to SEQ ID NO: 98. 

23. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 341 and ends at nucleotide 2304 of SEQ ID
NO: 93.

24. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 1858 and ends at nucleotide 2304 of SEQ ID
NO: 93.

25. Polypeptide of claim 21 or 23, exhibiting a 
reverse transcriptase activity. 

26. Polypeptide of claim 22 or 24, exhibiting a
ribonuclease H activity.

27. Polypeptide exhibiting an inhibitory 
activity on the reverse transcriptase activity of 
polypeptide of claim 25. 

28. Polypeptide having an inhibitory activity
on the ribonuclease H activity of polypeptide of claim 26.

29. Antigenic polypeptide recognized from the
sera of patients infected with the MSRV-1 virus, and/or in
whom the MSRV-1 virus has been reactivated, characterized
in that its peptide sequence is at least partially
identical or is equivalent to a sequence selected from the
group consisting of SEQ ID NO: 95, and fragments thereof,
in particular SEQ ID NO: 96, SEQ ID NO: 97 and SEQ ID NO:
98.

30. Mono- or polyclonal antibody directed
against the MSRV-1 virus, characterized in that it is
obtained by the immunological reaction of a human or
animal body or cells to an immunogenic agent consisting of
an antigenic polypeptide of claim 29.

31. Reagent for detection of the MSRV-1
virus, or of an exposure to the said virus, characterized
in that it comprises at least one reactive substance
selected from the group consisting of a probe as claimed
in claim 10 or 11; a polypeptide as claimed in any one of
claims 14 to 29; or an antibody as claimed in claim 30.

32. Diagnostic, prophylactic or therapeutic
composition, in particular for inhibiting the expression
of a virus associated with multiple sclerosis or
rheumatoid arthritis, and/or the enzymatic activity of the
proteins of said virus, said composition comprising a
nucleotide fragment of any one of claims 7 to 9.

33. Diagnostic, prophylactic or therapeutic
composition comprising a polypeptide of any one of claims
14 to 29, or an antibody of claim 30.

34. Process for detecting a virus associated
with multiple sclerosis or rheumatoid arthritis, in a
biological sample, characterized in that an RNA and/or a
DNA presumed to belong or originating from said virus, or
their complementary RNA and/or DNA, is/are brought into
contact with a nucleotide fragment according to any one of
claims 7 to 9.

35. Process for detecting the presence or
exposure to a virus associated with multiple sclerosis or
rheumatoid arthritis, in a biological sample, wherein said
sample is brought into contact with a polypeptide
according to any one of claims 14 to 29, or an antibody of
claim 30.

FIG. 1

[Figure: consensus nucleotide sequences labelled SEQ ID NO 3 to SEQ ID NO 7 (POL MSRV-1), of 85 to 86 bases each where a length is shown; the sequence text is not legible in the source scan.]



WO 98/23755 PCT/IB97/01482

FIG. 2

CONSENSUS A (SEQ ID NO 3)
GTTTAGGGATAGCCC TCATCTCTTTGGTCA GGTAaGGCCGAAGA TCTAGGCCACTTCTC 60
AGGTCCAGGCACTCT GTTCCTTCAG 85

CONSENSUS B (SEQ ID NO 4)
GTTCAGGGATAGCCC CCATCTATTTGGCCA GGCACTAGCTCAATA CTTGAGCCAGTTCTC 60
ATAGCTGGACACTCT TGTCCTTCGGT 86

CONSENSUS C (SEQ ID NO 5)
GTTCAGGGATAGCCC CCATCTATTTGGCCA GGCATTAGCCCAAGA CTTGAGTCAATTCTC 60
ATACCTGGACACTCT TGTCCTTCAG 85

CONSENSUS D (SEQ ID NO 6)
GTTCAGGGATAGCTC CCATCTATTTGGCCT GGCATTAACCCGAGA CTTAAGCCAGTTCTC 60
ATACGTGGACACTCT TGTCCTTTGG 85

[The three-frame amino acid translations printed beneath each consensus are not legible in the source scan.]

FIG. 3

[Figure: consensus nucleotide sequence of 96 bases; the sequence text and its SEQ ID NO label are not legible in the source scan.]

FIG. 6

CAAQCCACCC AAGAACICTT AAATTTCCTC ACTACCTCTC GCTACAAQGT 50 

TTDCAAAOCA AAQQCICAQC TCIGCICACA GGAGATTAGA T7CnM9GT 100 

TAAAATTATC CAAAGQCACC AQQQQQCICA GTGA3GAAQG TATOCAGXT 150 

ATACTOQGTT ATOCTCATOC CAAAAOOCTA AAGCAACTAA GAGOGTICCT 200 

TAQCATCATC AGGTTICTQC OGAAAACAAG ATTOOCAGGT ACAAOCAAAA 250 

ATTATMACA CIAATI^AGG AAACICAGAA AGCTAATOOC 300 

TATTTAGTAA GATOSACAOC TAAACAGAAG QCTTICCAGG OOCTAAAGAA 350 

G900CIAAOC CAAGOOOCAG TGTTCAGCTT GOCAACAQQS CAAGATTTIT 400 

CTTIATATGG CACAGAAAAA ACAGGAATDG CTCIAQGAGT CCTIACACAG 450 

GTCOGAG3GA TCAQCTTQCA AGCOGTO9CA TAOCTGAATA AGGAAATIGA 500 

TCTAGTQ32A AAQQCTTQQC! CICATOGTTT A1G32TAA1G <2^33CAGEAG 550 

CAGICINSGT ATCIGAAGCA GTIAAAATAA TACAQ9GAAG AGATCTTNTT 600 

GTGTQGACAT CICA1GA1GT GAAOQ9CA.TA CICAC1GCXA AAGGAC50T 650 

GIQSTTCICA GACAAOCATT TACTIAANIA TCAQ3CTCTA TIACTTCAAG 700 

AGOCAGTOCT QS3ACTQ03C ACTIG1QZAA CICTIAAAOC C 741 

SEQ ID NO [number illegible] (PSJ 17)

TCAGGGATAGCCCCCATCTATTTGGCCAGGCATTAGCGCAAGACTTGAGTC

AATTCTCATACCTGGACACTCTTGTCCTTCAGTACATGGATGATTTACTTT 

TAGTCGCCCGTTCAGAAACCTTGTGCCATCAAGCCACCCAAGAACTCTTAA 

CTTTCCTCACTACCTGTGGCTACAAGGTTTCCAAACCAAAGGCTCGGCTCT 

GCTCACAGGAGATTAGATACTNAGGGCTAAAATTATCCAAAGGCACCAGG 

GCCCTCAGTGAGGAACGTATCCAGCCTATACTGGCTTATCCTCATCCCAAA 

ACCCTAAAGCAACTAAGAGGGTTCCTTGGCATAACAGGTTTCTGCCGAAA 

ACAGATTCCCAGGTACASCCCAATAGCCAGACCATTATATACACTAATTA 

NGGAAACTCAGAAAGCCAATACCTATTTAGTAAGATGGACACCTACAGAA 

GTGGCnTTCCAGGCCCTAAAGAAGGCCCTAACCCAAGCCCCAGTGTTCAGC 

TTGCCAACAGGGCAAGA 1 mTCTl I ATATGCCACAGAAAAAACAGGAAT 
AGCTCTAGGAGrrCCTTACGCAGGTCTCAGGGATGAGCTTGCAACCCGTGGT 

ATACCTGAGTAAGGAAATTGATGTAGTGGCAAAGGGTT 



SEQ ID NO 8 (moo3-poo^) 
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3/66 



CO 



50 



60 



70 



CGC TTT CCC ACT ACA TCA ATT TTA. OCA GTA AOC AA CCC AAC OCA CAC TCC AOC TTA GTC CAA CAA CTC AOC 
PFATTS I LCVRKPNCQWRLVOE L R> 
a a a a A a & TRANSLATION OF KSKV-1 POL - (A) * * A A > 



80 



90 



100 



110 



120 



130 



HO 



ATT ATC AAT CAC OCT CTT CTT OCT CTA TAC OCA OCT CTA OCT AAC OCT TAT ACA CTC CTT TCC CAA ATA OCA 
IIN EAVVPLYPAVPNFYTVLSQ I P> 
JTRANSlATtOM OF KSRV-1 POL " (A)_ 



150 



160 



170 



160 



190 



210 



CAC CAA CCA CAC TCC TTT ACA CTC CTC CAC CTT AAC CAT GOC TTT TTC TCC ATC OCT CTA OCT OCT CAC TCT 



230 



^TRANSLATION OF KSRV-1 POL - 
240 2S0 260 



r 

|AJ_ 



CAA TTC TTC TTT OCC TTT CAA CAT OCT TTC AAC OCA AOC TCT CAA CTC AOC TCC ACT 



P T S 0 
1 OP KSRV-1 POL * 



L 
IAJ_ 



TTA OCC CAA CCC j 
L P 0 O J 



310 



320 



330j<- R 



340 



350 



360 



290 300 

TTC AOS CAT AOC 'CCC CAT CTA TTT CCC CAC OCA TTA OCC CAaIcAC TTC ACT CAA TTC TCA TAC CTC CAC ACT 
F R D S P H I- FCQAliAQlDX.SC F S Y L D T> 
* m » m m m m TR ANSLATION OF KSRV-1 POL * t Al * * * * A A A > 



370 
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390 



400 



410 



430 



430 



CTT CTC CTT CAC f TAC ATC CAT CAXfXTA CTT TTA CTC OCC OCT TCA CAA AOC TTC TCC CAT CAA CCC AOC CAA 

l v l q It m d dJ lllvarsetlchqato> 

£ -TXM OF KSRV-1 POL • f Al a a a * *, * * > 



440 



450 



460 



470 



460 



490 



500 



CAA CTC TTA ACT TTC CTC ACT AOC TCT CCC TAC AAC CTT TOC AAA OCA AAC OCT CCC CTC TCC TCA CAC CAC 
ELLTF J-TTCCYlCVSrpJCARLCSOE* 
m . m . * A «. TRANSLATION OF KSRV-1 POL - (Al__ 
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S20 



530 



S40 



SS0 



560 



570 



ATT ACA TAC TKA CCC CTA AAA TTA TCC AAA CCC AOC ACC COC CTC ACT CAC CAA CCT ATC CAC OCT ATA CTC 
I R Y X C X. XL LS K C T R A L S E ER X Q P X L> ~ 
* * m m * * m. TRANSLATION OF KSRV-1 POL * tAJ A A ■ A A * > 



560 



590 €00 610 620. €30 €40 

• m m • • * 

r CAT CCC AAA AOC CTA AAC CAA CTA ACA CCC TTC CTT CCC ATA ACA CCT TTC TCC CCA AAA CAC 



L R C F L C 
r OF KSRV-1 POL * (Al_ 



ITCrCRRO> 
AAA A A A A > 
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€70 



€60 



€90 
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710 



720 



ATT OCC ACC TAC ASC OCA ATA CCC ACA OCA TTA TAT ACA CTA ATT AM3 CAA ACT CAC AAA CCC AAT ACC TAT 
irRYXFXARPLYTLXXETQKAKTY* 
» * m m * m. m TRANgATIPN OF KSRV-1 POL * (A] * « * A * > 



730 



740 



750 



760 



770 



780 



790 



TTA CTA ACA TCC ACA CCT ACA CAA CTC CCT TTC CAC COC CTA AAC AAC COC CTA ACC CAA COC OCA CTC TTC 
L V R V T P TEVAFQALRRALT Q A FVF> 
a • m. M m m m. TRAKSLATTCM OF KSRV-1 POL * fAl a m a A A A A > 



600 



810 



820 



630 



640 



850 



860 



ACC TTC OCA ACA CCC CAA CAT TTT TCT TTA TAT GOC ACA CAA AAA ACA CCA ATA CCT CTA CCA CTC CTT ACC 
SL PTGQ DFSLYATEKTG IALCVLT> 
» m « * ■ * a TRANSLATION, OF KSRV-1 POL • *Al a n a A * A A > 



870 



680 



890 



900 



910 



920 



930 



CAC CTC TCA CCC ATC ACC TTC CAA CCC CTC CTA TAC CTC ACT AAC CAA ATT CAT CTA CTC CCA AAC- CCT TCC 
QVSGKSLQPVVYLSKEIDVVAKCW> 
TRANSLATION OF MSRV-1 POL * [A] >



940 950 960 970 980 990 1000



CCT CAT N3T TTA TCC CTA ATC CMC CCA CTA CCA GTC TWA CTA TCT CAA CCA CTT AAA ATA ATA CAC CCA ACA 
PHXLWVKXAVAVXVSEAVKI IQCR> 

TRANSLATION OF MSRV-1 POL * [A]



1020 1030 1040 1050 1060 1070 1080



CAT CTT NCT CTC TCC ACA TCT CAT CAT CTC AAC CCC ATA CTC ACT CCT AAA CCA CAC TTC TCC TTC TCA CAC 
DLXVWTSHDVNGILTAKGDLWLSD> 
TRANSLATION OF MSRV-1 POL * [A] >
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AAC CAT TTA CTT AAN TAT CAC OCT CTA TTA CTT CAA CAC CCA CTC CTC NGA CTC OCC ACT TCT OCA ACT CTT 
NHLLXYOALLLEEPVLXLRTCATL> 
TRANSLATION OF MSRV-1 POL * [A] >
FIG. 13

SEQ ID NO 46 (FBd3) 

GTGCTGATTGGTGTATTTACAATCCTTTATCTAATCCGAAATGCCCATGTTG 

CAATATGGAAAGAAAGGGAGTTCCTAACCTCTGGGGGAACCCCCATTAAA 

TACCACAAGTAAATCATGGAGTTATTGCACACAGTGCAAAAACTCAAGGA 

GGTGGAAGTCTTACACTGCCAAAGCCATCAGAAAAGGGAAGAGGGGAGAA 

GAGCAGCATAAGTGGCTACAGAGGCAAGGAAAGACTAGCAGAAAGGAAA 

GAGAGAAAGAGACAGAAAGTCAGAGAGAGAGAGAGGAAGAGACAGAGCA 

CAAAGAGGGAGTCAGAGAGAGAGAGAGACAGAGAGTCAGAGAGAAGGAA 

AGAGAGAGAGGAAGAGACAAAGAATGAATCAAACAGAGAGACAGAAAGT 

CAGAGAGAGAGAGAGAGAGGAAGAGACAGAGAAAAAGAGGGAGTCAGAA 

AAAGAGAGACCAAAGAAGAAGTCCAAAGAGAAAGAAAGAGAGATGGAAG 

TAGTAAAGGAAAAACAGTGTACCCTATTCCTTTAAAAGCCGGGGTAAATTT 

AAAACCTATAATTGATAACTGAAGGTCTTCTCTGTAACCCTGTAACACTCC 

AATACCACCTTGTTGTCAAGTGTAAACAAGGGCGTAGCCCAAAAGCACTG 

AGGCCACTAACAACCCATAGCCTTCCTATCAAAATTCCTTAACCCAGCAGG 

TTTCCTAACAGGGGATCTAAATCTTAATTAATTACCATACAATGGTCCAAC 

GAGACTTAGGAGGAATTCCCTTCAGGACGGGAAGATAGATGCTTCCTCCCA 

GGCGATTAAGGGAGAAAGACACAATGGGTATTCAGTAAGTGCCAAGGGGA 

ACACITGTAGAAGCAAAGTTAGGAAAATTGCCAAATAATTGGTTTGCTCAA 

GAGTTGTTTGCACTCAGCCAAACCITGAAGTACTTGCAGAATCAGAAAGGA 

GCCATCTATACCAATTCTAAGTTAATATGGACTGAAGGAGGTTTTATTAAT

ACCAAAGAGAAATTAAAATCCCAAACTTATAAGGTTTTCAACCAAAGTAA 

AGTTTGCTAAAAGTTAACAGCGTAACATGTATTATCCTACTACCACACACT 

CTCAAAGGATTTCTCAGACAGTTTGCAAGAAATAATGATATCTATCCTTAC 

TCrACAATCCCAAATAGACTCTTTGGCAGCAGTGACTCTCCAAAACCGTCA 

AGGCCTAGACCTCCTCAC1X3CTGAGAAAGGAGGACTCTGCACCTTCTTAAG 

GGAAGAGTGTTGTCrTTACACTAACCAGTCAGGGATAGTATGAGATGCTGC 

CCGGCATTTACAGAAAAAGGCITCTGAAATCAGACAACGCCTTTCAAATTC 

CTATACCAACCTCTGGAGTrGGGCAACATGGTITCTTCCCTI^ATGTCCC 

ATGGCTGCCATCTTGCTATTACTCGCCTTTGGGCCCTGTATTTTTAACCTCC 

TTGTCAAATTTGTTTCTTCTAGGATCGAGGCCATCAAGCTACAGATGGTCTT 

ACAAATGGAACCCCAAATGAGCTCAACTATCAACTTCTACTGAGGACCCCT 

AGACCAACCCCCTGGCCCTTTCACTGGCCTAAAGAGTTCCCCTCTGGAGGA 

CACTACCACTGCAGGGCCCCATCTTTGCCCCTATCCAGAAGGAAGTAGCTA 

GAGCAGTCATTGCCCAATTCCCAAGAGCAGCTGGGGTGTCCCGTTTAGAGT 

GGGGATTGAGAGGTGAAGCCAGCTGGACTTCTGGGTCGGGTGGGGACTTG 

GAGAACTTTTGTGTCTAGCTAAAGGATTGTAAATGCAACAATCAGTGCTCT 

GTGTCTAGCTAAAGGATTGTAAATACACCAATCAGCAC 
SEQ ID NO 51 (tpol)



GGCTGCTAAAGGAGACITGTGGTTGTCAGACAATCGCCTACTTAGGTACCA 

GGCCTTATTACTTGAGGGACTGGTGCTTCAGATGCGCACTTGTGCAGCTCT 

TAACCCAAAGTTATGCTGCCCAGAAGGATCTTTTAGAGGTCCCCTTAGCCA 

ACCCTGACCTCAACCTATATATATACTGATGGAAGTTCGTTTGTAGAAAAG 

GGATTACAAAGGGNAGGATATNCCATAGGTTAGTGATAAAGCAGTACTTG 

AAAGTAAGCCTCTTCCCCCCAGGGACCAGCGCCCCCGTTAGCAGAACTAGT 

GGCACTGACCCCGAGCCTTAGAACTTGGAAAGGGAGGAGGATAAATGTGT 

ATACAGATAGCAAGTATGCTTATCTAATCCGAAATGCCCATGTTG 



5/9/2006, EAST Version: 2.0.3.0 



WO 98/23755 



PCT/IB97/01482 



15163 



SEQ ID NO 52 (JLBc1)

TCAGGGATAGCCCCCATCTATTTGGTCAGGCACTGGCCCAAGATCTAGGGA
CATGCCACTTTTAAGAGCCATTTCTCAAGTCCAGGTACTCTGGTCCTTCGGT
ATGTGGATGATTTACTTTTGGCTACCAGTTCAGTAGCCTCATGCCAGCAGG

CTACTCTAGATCTCTTGAACrrTCTAGCTAATCAAGGGTACAAGGCATCTA 

GGTTGAAGGCCCAGCTTTGCCTACAGCAGGTCAAATATCTAGGCCTAATCT 

TAGCCAGAGGGACCAGGGCACTCAGCAAGGAACAAATACAGCCTATACTG 

GCTTATC^CACCCTAAGACATTAAAACAGTTGCGGGGGTTCCTTGGAATC 

ACTGGCTTTTTGGTGACTATGGATTCCCAGATACAGCAAGATTGGCAGGCC 

CCTCTATACTGTAATCAAGGAGACTCACGAGGGCAAGTACTCATCTAGTAG 

AATGGGAACTAGGGACAGAAACAGCCTTCAAAACCTTAAAGCAGGCCCTA 

GTACAATCTCCAGCTTTAAGCCTTCCCACAGGACAAAACTTCTCTTTATAC 

ATCACAGAGAGGGCAGAGATAGCTCTTGGTGTCCTTATTCAGACTCATGGG 

ACTACCCCACAACCAGTGGCACACCTAAGTAAGGAAATTGATGTAGTAGC 

AAAAGGCTGGCCTCACTGTTTATGGGTAGCTGTGGTGGTGGCTGTCTTAGT 

GTCAGAAGCTATCAAAATAATACAAGGAAAGGATCTCACTGTCTGGACTA 

CTCATGATGTAATGGCATACTAGGTGCCAAAAGAAGTTTATGGGTATCAGA 

CAACCACCTGCTTAGATACCAGGGACTACTCCTGGAGGATTGGGCTTCAAG 

TGCGTTTTTrGTGGCCTCAACCCrGCCACTTTTCCTCCAGAGGATGGAGAG 

CCGCTTGAGCATGCTTGCCAACAGGTTGTAGGCCAGAATTATTCCACCCGA 

GATGATCTCTTAGAGTACCCTTAGCTAATCCTGACCTTAACCTATATACCA 

ATGGAAGTTCATTTGTGGAAAACGGGATATGAAGGGCAGGTTATGTCATAG 

TTAGTGATGTAATCATACTTGCAAGTAAGCCTCTTACCCCAGGGGCCAGCA 

CTCAGTTAGCAGAACTAGTCACACTTACCTTAACCTTAGAACTGGGAAAGG 

GAAAAAGAATAAATATGTATACAGATAGTAAGTATGCTTATCTAATCCTAC 

ATGCCCATGCTGCAATATGGAAGGAAAGGGAGTTCCTAACCCCTGGGGGA 

ACCCCCATTAAATACCACAAGGYAAATCATGGAGTTATTGCACGCAGTGC 

AAAAACTCAAGGAGGTGGCAGTCTTACACTGCCGAAGCYATCAAAAAGGG 

GAAGGAGAGGGGAGAACAGCAGCATAAGTGGTTGGCAGAGGCAGTGAAA 

GACCAGCAGAGAGAAGGAGAGAGACAACGTCAACGACAGAAGGAAAGAA 

GAGGAGGAGACAGAGAGGAAGAGACAGAGAGACAGTTAGTCCAAGAGAG 

AGACAGAGAGAGGAAGAGACAGACAGAAAGTCCAAGAGAGAAGGAAAGA 

GAGGAAGAGACCAAGGAGTCCNAGAGAGAGAAAGAGATAGAAGTAGTAA 

AGAAAAAACATTGTACCCTATTCCTTTAAAAGCCGGGGTATATTTAAAACC 

^AT^TTGATAAT^AG^ 

AAACCCTCAACCGATATGTGAAAATTGTGGGTCGTCCCTATGTCTCAATTA 

CCAGCCAATACCCCCTTGTTTTTAGTGTGAACGAGGGTGTAGAGCGCAGAC 

AGGGAGACCTCTGACAATCCATACCCTTCCTATCCAAA^CC^AACCCAG 

CAGGTTTTCTAAAAGGGGATCTAAATCTTAATTAATTACCATACAAAGGTC 

AAACCAGATCTAGGAGGAACTTCCTTCAGGACAGGATGATAGATGGTTCCT 

C^CAGGCGATTAAAGAAAATAAAAAGACACATGGGCAGCCAGTAAGTGAT 

AAGGGAACACTAGTAGAAGCAGTTAGGAGAAGTTGCCTAA 
ACTCCAAATGTGTGAGTTGTTCGCACTCAGCCCAAATCTTAAAGTACTTAC 

GAGGTTTTATTAATAGCGAAGGAGAATTAAATCCTAAACTNACAAGGTTTT 

caactIaIgtaaattttactaaaagctaacag^ 

CTACAACACACTCTCANAGGATTCCTCAGACAGTTTACAAGAAATAACAA 

aatctatctc^ 
cagtgactctc 

FIG 16 
SEQ ID NO 53 (JLBc2)
TCAGGGATAGCCCXrCATCTATTTGATCAGGCACTAGCCCAAGATCTAGGCC 
ACTTCTGAAGTCCAGGCATTCTAGTCCTTCAGTATGTGGATGATTTACTTTT 
GGCTACCAGTTTGGAAGCCTCATGCCAGCAGGCTACTTGAGATCTCTTGAA 
CITTCTAGCTAATCAAGGGTGTATGGCATCTAAATTGAAAGTCCAGCTCTG 
CCrACAACAAGTCAAATATCTAGGCCTAATCTTAGATAGAAGAACCAGGG 
CCCTCAGCAAGGAATGAATAAAGCCTATGCTGGCTTATCGGCACCCTAAGA
CATTAAAACAATTGTGGGGGTTCCTTGGAATCACTGGCTTTTGCCGACTAT 
GGATCCCTGGATAGAGTGAGATAGCCAGGCCCCCTCTATTACTCTTATCAA 
GGAGACCCAGAGGGCAAATACTTATCTAGTATTATGGGNACCAGAGGCAG 
AAAAAGCCTTCCAAACCTTAAAGGAGACCCTAGTACAAGCTCCAGCTTTAA 
GCCTTCCCACAGGACAAANCTTCTCTTTATATGTCACAGAGAGAGCAGGAA 
TAGCTCCTGGAGTCCTTACTCAGACTTTTGGACGACCCCACGGCCAGTGGC 
RTACCTAAGTAAGGAAATTGATGTAGTAGCAAAAGGCTGGCCTCACTGTTT 
ATGGGTAGTTGCGGCTGTGGCAGTCTTACTGTCAAAGGCTATCAAAATAAT 
ACAAGGAAAGGATTTCACTATCTGGACTACTCATGAGGAAAATGGCATATT 
AGGTGCCAAAGGAAGTTTTTGGCTATCAGACAACCACCTGCTCAGATrCCA 
GGCACTACn3ATTGAGAGACCAGTGCTTTAAATATGTATGTGTGTGTGTGG 
CCCTCAACCCTGCCACTGTTCTCCCAGAAGATGGAGAACCAATGAAGCATT 
ACTGTCAACAAATTAGAGTCCAGAGTTATGCTGCCTGAGAGGATCTCTTAG 
AAGTCCCCTTAGCTAATCCTGACCTTAACCTATATGCTGATGGAAGTTCAC 
TTGTGGAGAATGGGATACGAAAAGCACATTATGCCATAGTTAGTGAGGTA 
ACAGTACTTGAAAGTAAGCCTATTCCCCCATGGACCAGAGCCCAGTTAGCA 
GAACTAGTGGCACTTACCCAAGCCTTAGAACTAGGAAAGGGAAAAATAAT 
AAATGTGTATACAGATAGCAAGTATGCTTATCTAATCCTACATGCCCATGC 
TGCAGTATGGAAAGAAAGGGAGTTCCTAACCTCTGGGGGAACCCCCATTA 
AATACCACAAGGCAAATCATGGAGTTATTGCATGTAGTGCAAAACCTCAA 
GTAGGTGGCAGTTTTACACTGCCTGAAGCTATGGGGAAGGAGAGAGGAGA 
ACAGCAGCATAAGTCGCTAGCAGAGGCAGCGAAAGACTAGCAGAGAGGA 
GAGGTAGGGGAAAGACAGAAAGTCAAAGAAAAGAAGTCAAAGACAGACA 
GAGAAAGAGACAGAGGGAGCCAGAGAGAAAGAAAAGAGAGAACGAAAGA 
GACAGAATGTCAAAGAACAGAAGAGAGAGGCAGCGCCAGAAGAGTTAAG 
AAAGTGAGAAAGAGAGATGGAAATAGTAAAGAAAAAACAGTGTACCCTAT
TCCTTTAAAAGCCAGGGTAAATTTAAAACGTATAATTTTATAATTGGAAGG 
TCTTCTCCATAACCCTATAACATTAAAATACCACCTTGTTGTCAGTGTAAAC
AAGAGCATAGCCCAAAAGCACTGAGGCCACTGAGAACCCATAGCCTTCCT 
ATCAAAAATCCTTAACTCTGCAGGTTTCCTAACAGGGGATCTAAATCTCAA 
CTAATCACCATACAATGGTCCGACCAGACCTAGGAGCGACTCCCCTCAGG 
ACAGAAGGATGGATGGTTCCTCCCAGGCCATTAAGGGAAAGAGACACAAT 
GGGTATTCAGTAAGTGATAAGGGAACTCTTGTAGAAGCAGTTAGGAAGATT 
GCCTAATATrrGGTCTGCTCAAATGTGCCAGCTGTTTGCACTCAGCTAAAC 
CTTAAATTACTTACAGAATTAGGAAGGAGCCATCTATACCAATTCTGAGTT 
AATATGAGCTGAACAAGTTCTTATTAATAGCAAAGAATCATTGAAATCTCA 
AACTTGCAAAGTrTTCAACAAAAGTAAAGTTTGCTGAAAGTTAGCAGTGTA 
ACATGTATTATCCTAACTTCTAATCTTGTGGAAATCAGACCCTATCAGTGC 
CCCTCAAAGCTGAAGTCCATCAGCATATGGCCATACAACTAATACCCCTAT 
TTATAGGGTTAGGAATGGCCACTGCTACAGGAATGGGAGTAACAGGTTTAT 
CTACTTCATTATCCTATTACCACACACTCTTAAAGGATTTCTCAGACAGTTT 
ACAAGAAATAACAAAATCTATCCTTACTCTNTARTCCCAAATAGRTTCTTT 

GGCAGCAGTGACTCTC 

FIG. 17 
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1 TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA 

61 GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 

121 ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC 

181 AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 

241 AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAA AGGAAGAAAA 

301 TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA 

361 CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 

421 CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 

481 CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 

541 TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 

601 CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 

661 CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 

721 ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA 

781 CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 

841 ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 

901 TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 

961 TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 

1021 TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 

1081 CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 

1141 ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 



SEQ ID NO 56 (GM3) 
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GATGCCTTTTTCTGCATCCCTGTACGTCCTGACTCTCAATTCTTGTTTGCCTTTGAAG 
ATCCTTTGAACCC^CGTCTCAACTCACCTGGACTGTTTTACCCC^AGGGTTCAGGGA 
TAGCCCCATCTATTTGGCCAGGC^TTAGCCCAAGATGCCTTTTGCATCCCTGTACGTG 

ACTCTCAATTCTTGTTTGCCTTTGCCTTTGAAGATGCTTTG 
CACCTGGACTGTTTTACGCCAAGGGTTCAGGGATAGCCCCCATCTATTTGGC 

CAGGCATTAGCCCAA SEQ ID NO 40



Asp-Ala-Phe-Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe- 
Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn-Pro-Thr-Ser-Gln-Leu- 
Thr-Trp-Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His- 

Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln 

SEQ ID NO 39 (POL2B) 
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FIG. 34 



Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe-Leu SEQ ID NO 41 



Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His-Leu-Phe-Gly-
Gln-Ala-Leu-Ala SEQ ID NO 42



Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu SEQ ID NO 43

Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn SEQ ID NO 44 
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CAEAGACAAA GGAGEAAACA ATCAADCAAA GAGIQOCAAT AITOOCIGCT 100 
IDK GVNN EPK SAN IPWL 
.TK E.T MNQR VPI FPG 
HRQR SKQ .TK ECQY S L V 

TA3GCADCCT OO&GCGGIG GGAGAAGAAT TO3Q00CAGC C3GAGIGCAT 150 

CTL Q A V GEEF GPA RVH 
YAPS KRW EKN SAQP ECM 
MHP PSGG RRI RPS QSAC 

GEAOCTTTTT C1UIUICACA CTIGAAGCAA ATEAAAATAG iOJEAGGDEV 200 
VPFS LSH LKQ IKID XGX 
YLF LSHT .SK L K . T.VN 
TFF SLT LEAN . N R XRX. 

AT1N1LIAGAT JOXJCTGAIG GYTATATIGA TGITTTACAA GGA3TB3GAC 250 
XSD SPDG YID VLQ GLGQ 

XQI ALM XILM FYK D.D 
I X R . P.W LY. CF TR IRT 

AATCCTITCA TCT3ACAIGG AGAG^EATAA TATTACIGCT AAATCAGBCG 300 

S F D L T W R D I I L L L N Q T 
NPLI . H G EI. YYC. IRR 
IL. S D M E RYN ITA KSDA 

CIAACCICAA A3GAGAGAAG TQCTO0CA3A AZDGGAGOOC GftGBGITTGG 350 
LTSN ERS A A I TGAR EFG 
.PQ MREV LP. LEP ESLA 
NLK . E K CCHN WSP RVW 

OASUICIGG TA1CICAGIC A3GTCAATCA. TAGGATCACA A33GAGGAAA 400 
NLW YLSQ VND RMT TEER 
ISG ISV RSMI G.Q RR K 
QSLV SQS G Q . . D D N G G< : K 

GAGAACGATT OXCACAG3G CAGCAGQCAG TIDOCTOIGr iO^ICCTCAT 450 

ERF PTG QQAV PSV APH 
ENDS PQG SRQ FPV. LLI 
RTI PHRA AGS SQC SSSL 

TO33ACACAG AATCAGAACA TOGAGATTCG TOOOGCAGAC ATTIA 495 
WDTE SEH GDW CRRH L 
GTQ NQNM EIG A A D I 
GHR IRT WRL V PQT F 
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FIG 36 
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CITOCXXMC TAATAMGAC COCXXTTKA MOCAAACAG TOCAAAMQA SO 
LPQL IRT PLS TQTV QKD 

CATOGACAAA GSAGEAAftCA ATOftftCCAAA GfiGIQOCAAT MTOOCIQCT 100 
IDK G V N N EPK SAN IPWL 

TATOCAOXT CEAftGOQGTC GG&GAMAAT CAGAGIGCAT 150 

CTL Q A V GEEF GPA RVH 

GERDCTinT CTOICTCACA CITCftAGCAA AXTAAAATftG AOCTftGSEftA 200 
VPFS LSH LKQ IKID LGK 

MTCIOGAT MOOCiraro GYEA03OTGA TOITmCAA GGATIMGftC 250 
FSD SPDG YID VLQ GLGQ 

AAIDCTTIGA TCIGftCATOS MftGftEAEAA TOTTA CI GCT PAKTCP&CG 300 
SFD LTW RDII LLL NQT 

CTAACCTCAA ATCAfflG&SG TOCIGCXMA ACIGGMOCC GMAGITIGG 350 
LTSN ERS A A I TGAR EFG 

CAArCICIQG OMCIC^IC ATTC&tfTCA TPO^ATCACA ACQGMGftAA 400 
NLW YLSQ VND RMT TEER 

GftGAAOGATT OCOC2VOV333 Cft3CAG3CM3 TTOCC&3IGT A3CTOCICAT 450 
ERF PTG QQAV PSV APH 

AATCfiGAACA TO3&GA3TOG TQ003CAG&C ATTEftCAACr 500 
WDTE SEH GDW CRRH LQL 

TOCCT3CEAN A&QGACT^G GAAAACT^GG ATTATICAAN 550 

ACX KDXG KLG RLX IIQX 

GATOIOCACT AM£\CACfiG3 QC¥AW3W£ AAAATOCTOC TQOCTHCIG 600 
CPL XHR GKEE NPT AFL 

GftG&GftCIfcA GGGAGGCATT G&QGAA3COT ACC&3XAAG TOGACMTOG 650 
ERLR EAL RKH TRQV DIG 

AAM9G&AAA GITOGQCAAA TIATATOOCT AATAG3GOT 700 
GSG KGKS WAN YMP N R : A C 

GCnOCfiOTG CAGICTACAA GGACQCITIA GAAAAGftTIG TOCWOTAGA 750 
FQC SLQ GRFR KDC PSR 

AMAMOCBC OXlOCmJCA TOCEOCTTOT CTCAfiGQGftA TCACTSSAftG 800 
NKPP LVH APY VKGI TGR 

QOCIACIQQC OCft333GftOG AMGfKCTCT GMICftSftAG OCACIAMCT 850 
PTA PGDE GPL SQK PLT. 

<A 8S2 
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AAGGAAftCIC AGAAAQOCAA TAOCXMTIA GEA?CA!IGGA CAOCAGMGC 50 
KETQ KAN THL VRWT PEA 
RKL RKPI PI. . D G HQKQ 
GNS ESQ YPFS KMD TRS 

AG&AGCAGCT TIOCftGOOOC TAAAGAAATC OC7EAAOCEAA Q00OCAGICT 100 
E A A FQAL KKS LTQ APVL 
KQL SRP . R N P . P K PQC 
RSSF PGP KEI PNPS PSV 

TMGCTIGOC AAOQQ3QCAA GftLTlTldT TATftlGICAC A3AAAAAOG 150 

SLP TGQ DFSL YVT EKQ 
.ACQ RGK TFL YMSQ KNR 
KLA NGAR LFF ICH RKTG 

GAATAGCIUT AGGftGTKXTT A3OG3I0C ABGOGACAAG CITOCAAGCT 200 
E.L. ESL HRS KGQA CNL 
NSS RSPY TGP RDK LATC 
I A L GVL TQVQ GTS LQP 

GIQQC^TADC TG^GIAAGGA. AACIGATOm NIGQCAAAQS GTTOGOCTCA 250 
WHT . VRK LMX WQR VGLI 
GIP E.G N.CX GKG LAS 
VAYL SKE TDV XAKG WPH 

TIGTTEACAG GEAGGQC&3C AGEAGCAGIC TIAGTTTTCIG AAACAGTTAA 300 

VYR .GS SSSL SF. NS. 
LFTG R A A VAV LVSE TVK 
CLQ VGQQ .QS . F L KQLK 

AA3AAEACAG QGAAGAGAIC TEOGIGIG G30ECICAT GA3CTGAAOG 350 
N N T G KRS YCV DIS. CER 
IIQ GRDL TVW TSH DVNG 
.YR EEI LLCG HLM M.T 

QCATACTCAC TQCIAAAGAG GACTIGIQQC TCICAGACAA (XATTEACTT 400 
HTH C.RG L V A V R Q P F T 
ILT AKE DLWL SDN HLL 
AYSL LKR TCG CQTT IYL. 

AAAEfiQCaQG T lL'i m ' EA CT TGASGIGOCA GIGCIGCGAC TOCACAITIG 450 

I A G SIT .SAS A A T AHL 
K.QV LLL EVP VLRL HIC 
NSR FYYL KCQ CCD CTFV 

TOCAACICIT A&OOCAGOCA CMTICITOC A3ACAAIGAA GAAAAGAIAG 500 
CNS. PSH ISS RQ.R KDR 
ATL NPAT FLP DNE EKIE 
QLL TQP HFFQ TMK KR. 
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IvJ^O AACAIAACIG TCAACAAGIA ATIGCTCAAA OCTATOCIGC TOGAQQQGAC 550 
U T.L STSN CSN LCC SRGP 
U HNCQQVIAQTYAARGD 
NITV N K . LLK P M £T L E G T 

TIOXTIGAC TGATOOCGAC CICAACTIGT AEACTGATO3 600 
SRG SLD .SRP QLV Y.W 
LLEV PLT DPD LNLY TDG 
F.R FP.L IPT STC ILME 

AAGTKXTIG GCAGAAAAAG GACTTTGAAA AG0Q3GCTAT GCAGIGATCA 650 
KFLG RKR TLK SGVC SDQ 
SSL AEKG L.K AGY AVIS 
VPW QKK DFEK RGM Q . S 

GIGATAATOG AATACTIGAA AGEAATCGOC TCACTOCMG AACTAGIGCT 700 
..WNT.K .SPHSRN.CS 
DNG ILE SNRL TPG TSA 
VIME YLK VIA SLQE L V L 

CAOCTOQCAG AACTAA22GC OCTCftCTIGG GCACEAGAAT TBGGAGAAGG 750 

PGR TNS PHLG TRI RRR 
HLAE L I A LTW ALEL GEG 
TWQN..PSLGH.N . EKE 

AAAAAGQGEA AATATATATT CTGACIUTAA GEATOCTIAC CTAGIOCICC 800 
K K G K. Y I F ....R_,L_ . V C L P JS„. P .P 
KRV NIYS DSK YAY L VL H 
KG. IYI QT LS MLT . S S 

ATOCXXA3QC AGCAAEAIGG AGAGBGAGQG AATTOOTAAC TIUIGMQGA 850 
CPC SNME REG IPN F.GN 
AHA AIW RERE FLT SEG 
MPMQ QYG ERG NS.L LRE 

ACAOCIETCA AOCATCAQGG AAGOCATEAG G&GATIATIA TIGGCTGEAC 900 

TYQ PSG KPLG DYY WLY 
TPIN HQG SH. E I I I GCT 
HLS TIRE AIR RLL L A V Q 

AGAAAOCIAA AGAGGIGQCA GIUTEACACT G0CAG3GICA TCftGGAAGAA 950 
RNLK RWQ SYT ARVI RKK 
ET. RGGS LTL PGS SGRR 
KPK EVA VLHC QGH QEE 

GAGGAAAGQG AAATAGAAGG CAA3D30CAA G0GGA3ATIG AMCAAAAAA 1000 
RKG K.KA I A K RIL KQKK 
GKG NRR QSPS GY. SKK 
EERE IEG NRQ ADIE AKK 
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CEQGACTCIC CATITiGAAAT QCTEATftGAA QGAOCXXTAG 1050 
PQG RTL H.KC L . K DP. 
SRKA GLS IRN AYRR TPS 
AAR QDSP LEM LIE GPLV 

TA3GQQCTAA TCDCXTCIGG GAAAOCA&3C OOC^GTACIC A3CMG&AAA 1100 
YGVI PSG K PS PSTQ QEK 
M G . SPLG NQA PVL SRKN 
WGN PLW ETKP QYS AGK 

AEBGAAT&GG AAAOCTCACA AGGftCAIEACT TICCTOOCCT OCAGAIQQCT 1150 
. N R KPHK DIL SSP PDG. 
RIG N Lr~T R .T Ybr-E... E. P..L . Q M A 

IE.E TSQ GHT FLPS RWL 

AGOCSCIGBG GAAQGAA 1167 

P L R K E 
S H . G R 
ATE EG 
FIG 39

AALTlGGCiTC T/^MSO" AIGAATIMT 50

NLRA RRT KEN .EDY ELF
TCV LEGL RKT RKT MNYS
LAC • K D . G K L GRL .II

Q CAATCATCIC CAOTMCA CftGQQGAMG GAAG&AAATC CTACIGOCTT 100 
NDV HYNT GER KKI LLPF 
MMS TIT QGKG RKS YCL 
Q.CP L.H RGK EENP TAF 

TCIQGSGAC21 CTAAGG3AQG CATUSO^A GCATAOCAGG CAAGIGGACA 150 

WRD . G R H.GS IPG KWT 
SGET KGG IEE AYQA SGH 
LER LREA LRK HTR QVDI 

TIQGAGQCIC TOGAAA&3GG AAAACTIQQG CAAATIGAAT 200 
LEAL EKG KVG QIEC LIG 
WRL WKRE KLG KLN A - . G 
GGS GKG KSWA N.M PNR 

G LT1ULT1 D D AGTCQCAGICr TITftGAAAaG MTCTGCAAG 250 

LAS SAVY K D A LEK IVQV 
LLP VQS TRTL . K R LSK 
ACFQ CSL QGR FRKD CPS 

TBGAAAE&AG 003000CTOG TOCAT3000C TEATCICAAG GGAATCACIG 300 

EIS RPS SMPL M S R ESL 
. K . A APR PCP LCQG NHW 
RNK PPLV HAP YVK GITG 

GAMQQCTAC TQQOOCAGGG GADGAAQ3TC CICIGAGICA GAAGCX30A 350 
EGLL PQG TKV L.VR SH. 
KAY CPRG RRS SES EATN 
RPT APG DEGP LSQ KPL 

ACCIGA3GAT GCMCMCAG GACIGBGGCT G00CX9QGQCA ASTOQCAGCT 400 
PDD PAAG LRV PGA SASP 
L M I QQQ D.GC PGQ VPA 
T ..S SSR TEG ARGK CQP 

CAT3QCAICA CXX7ICB3AGC (XXX3GGIMG TITG302AIT GAGAG0CAG3 450 

CHH PQS PGYV .PL RAR 
HAIT LRA PGM FDH. EPG 
MPS PSEP RVC LTI ESQE 

AAGTEAACIG TCICCIGGAC ACIGGOQCAG OCTTCTCBGT CTIACTTTOC 500 
KLTV SWT LAQ PSQS YFP 
S.L SPGH WRS LLS LTFL 
VNC LLD TGAA FSV LLS 
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PIC"" OQ TCIOXSGAC AACTSTOCIC CfiGAICIGIC ACTAimSG Q3SIOCiaaG 550 
1 t O O 7 v P D N C P P D L S L S E G S . D 

SQT IVL QICH YPR GPK 
CPRQ LSS R S V TIRG VLR 

AOODCAGIC ACTACATACT TCICTCMOC ACIAACTIGT GACTOQQGftA 600 

SQS LHT SLSH . V V TGE 
TASH YIL LSA TKL. LGN 
QPV TTYF SQP LSC DWGT 

CITTACTCTr TICACA1QCT TITCIAATIA TOOCIGAAAG aDOCACIOCC 650 
LYSF HML F.L CLKA PLP 
FTL FTCF SNY A . K PHSL 
LLF SHA FLIM PES PTP 

TlUrUAGQSV GPGACKTYIT AGCAAAAGCA G33Q0CATEA T30£CTGAA 700 
C.G ETF. QKQ GPL YT.T 
VRE RHF SKSR GHY TPE 
LLGR DIL AKA GAII HLN 

CATAQGAAAA QGAATAOCCA TITQCIGICC OL'lQJl'IGAG GAAGGAAITEA 750 

.EK EYP FAVP CLR KEL 
HRKR NTH LLS PA.G R N . 
IGK GIPI CCP L L E , E G I N 

A3XXJIGAW3T CTQQQCAAEA GAAGGACAAT A3QGACAAGC AAAGAAUGCC 800 
ILKS GQ. KDN MDKQ RMP 
S.S LGNR RTI WTS KECP 
PEV WAI EGQY G Q A KNA 

QJlCClOriC AaGTEAAfiCr AAAQGaTICT QOLUtlTl'lC (XTOOCAA^G 850 
VLF K L N . RIL PPF PTKG 

SCS S.T KGFC L LS LPK 
RPVQ VKL KDS ASFP YQR 

GAAGTCAOXT CTEAGAO00G AQGOOCTACA MG^CTCAAA AGATIUTEAA 900 

STL LDP RPYK DSK DC. 
EVPS . T R GPT RTQK IV K 
KYP LRPE ALQ GLK RLLR 

QGADCIAAAA GCDCAAG90C TAGTAAAAQC A3X3CAGEAGC OOCTGCAAEA 950 
GPKS PRP SKT M Q . P LQY 
DLK AQGL VKP CSS PCNT 
T.KPKA ..NHAVAPAI 

CTCCAATTIT AQGAGEAMG AAAOOCAAGG GACAGIOGAG GTTAG7IGCAA 1000 
SNF RSKE TQR TVE VSAR 
PIL GVR KPNG QWR LVQ 
LQF.E.GNPTDSGG.CK 
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C GATCICAGGA TEATTAATCA GGCIGTITIT OCJTCEATAOC OGCTCIATC 1050 

S Q D Y . . GCFS SIP SCI 
DLRI INE A V F PLYP AVS 
ISG LLMR LFF LYT QLYL 

TAGOCCTTAT ACICIGCTTT OOCTAATACC A3AGGAAGCA GAGEAGITIA 1100 
PLY SAF P N T RGSR VVY 
SPY TLLS LIP EEA E.FT 
A L I LCF P.YQ RKQ SSL 

CAGTOCIQGA QCTEAMGAT GOL'lLTl'lLT QCATOOCIGT A^ATOCTGAT 1150 
SPG P . G C LFL HPC T S . F 
VLD LKD ASFC IPV HPD 
QSWT LRM PLS ASLY ILI 

TCTCAATTCT TOTTGIUIT TGAAGATCCT TK5AAOEAA TCICICAATT 1200 

SIL VCL . R S F EPN VSI 
SQFL FVF EDP LNPM SQF 
LNS CLSL KIL . T Q CLNS 

cagctqgact Grrrivmr AG9QSITO0G GGATMQOOC CATCEATTIG 1250 
HLDC FTP GVP G . P P SIW 
TWT VLPQ GFR DSP HLFG 
PGL FYP RGSG IAP IYL 

G0O^33CNTT AOO0CAA3AC TiraGOCAAT TCICATAOCT GGACAliLTIG 1300 
PGI SPRL EPI LIP GHLV 
QAL AQD LSQF SYL DIL 
ARH. PKT .AN SHTW TSC 

ranrcGCTA tgqgmgatt taatttimc CAexmncA gaaaoctict 1350 

LRY GMI . F . P PVQ KPC 
SFGM G . F NFS HPFR NLV 
PSV WDDL I L A TRS ETLC 

GOCATCAAGC CADOCAAGOG TICTEAAATT TOCTCACTCE GIGIQGCTAC 1400 
AIK^P PKR S.I SSLR VAT 
PSS HPSV LKF PHS VWLQ 
HQA TQA FLNF LTP CGY 

AAGGTITOCA AADCAAAGQC TCAGCTCIGC TCACM3CAQG TEAAATACTT 1450 
RFP NQRL SSA HSR LNT. 
GFQ TKG SALL TAG . I L 
KVSK PKA QLC SQQV KYL 

AGGGfTEAAAA TEATCCAAAG GCAOCAGQGC GCICIGIGAG GAA3X5TATOC 1500 

G.N YPK APGP SVR NVS 
RVKI I Q R HQG PL.G MYP 
GLK LSKG TRA LCE ECIQ 
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AAOCTCTACT GQLTIML'IT CATOOCAAAA GOCTAMQCA ACTAAGAAGG 1550 
NLYW LIF IPK P.SN .EG 
TCT GLSS SQN PKA TKKV 
PVL AYL HPKT LKQ LRR 

ranroocAT aaogottic tooogaa 1577 

PWH NRFL P 
LGI TGF CR 
SLA. QVS AE 
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TOCAGCAQCA QGACIGAGQG TOCXX33QQQC AAGIGOCAQC CO^XXATC 50 
SSSR TEG ARG KCQP MPS 

xjxnEM&G ocmmcAT gitigaocat tcagagcog gaagteaact 100 

PSE PRVC L T I ESQ EVNC 

GIUIOCTOGA CACT33QQCA QOOTKJICNS TCmCTTIC CTGICCTAG^ 150 
LLD TGA AFSV LLS CPR 

CAATICTOCT COOECIGT CACIATCCGA G3QGICCTAA GACMOCAGT 200 
QLSS RSV TIR GVLR QPV 

CACTACM^C TiOidCMC CACTAAGTIG TGACIQQQGA ALTl'l!ACICr 250 
TTY FSQP LSC DWG TLLF 

TITCACATOC TTTIUEAA3T ATOOCT^AA GCDOCACICC CTIGTTAGQG 300 
SHA F L I MPES PTP LLG 

A3AGACATIT TAQCAAAA3C AGQQQOCAIT AEACACX7D3A ACATAGGAAA 350 
RDIL AKA G A I IH L~ N I G K 

AQSAATAOCi: A3TIGCTGIC OOOIQCTIGA QGAAGGAAIT AAIXXTCASG 400 
GIP ICCP LLE EGI NPEV 

TCIGQQCAAT ASAAGGACAA TAIGGACAAG CAAAGAA3GC G03KXTOIT 450 
W A I EGQ YGQA KNA RPV 

CA2GTEAAAC TAAAGGATTC TGDOILLTIT OOCTAOCAAA QGAACTAOOC 500 
QVKL KDS ASF PYQR KYP 

TCITAGA03C GAGQOCCTAC AAG3ACICAA AAGAJTIGTEA AGGAOCT 547 
LRPEALQGLKRLLRT 
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FIG 46 



TCOOCAGCA GSACIGflGGS TOCOIZZEC AAGTGOMC OOKIGOCA3U 
G ARG KCQP MPS 

GOOCTOVGfC aXUDGTAT GTTIGAGCAT TGtGMXOG GAAGTTAACT 
PSE PRVC LTI ESQ EVNC 



O ' ldUT GCA CACIGGCECA GOCTICICKG TCTTACmC CK7DCQQGA 
L L D TCA AFSV LLS CPR 

OIA TIKHJU OCAGMCTGT QCTATOOGA QggTQCPC GJOGOCAGT 
QLSS RSV T I R GVLG QPV 

CPdPOK'UC TTCICTCAQC OCTMGTTG TCfCTOOQQV ACTTTACICT 
TTY FSQP LSC DWG T L L F 

TTItaCATGC TTTTCDATT A3GOCTCAAA 00XOCPX Cl'iUl'lMX 
SKA F L I MPES PTP LLG 



AGfiGXATTC TAGCAAttGC MGGOOOOT A33CAOCKA 
R D : L A K A G A I I H L N 



AGSUSAOQC ATTIGdGTC COCTQCnSA QGft&GGAATT AKPGCHGAAG 
GI? ICCP LLE EGI N P E V 

TCIGOEAAT AGAAGSCAA THTOSOAG CAAAGAKDX COTICCICTT 
WAX EGQ YGQA K N A RPV 

CAAGT1AAAC TAAflGGSTTC TOCCTOCTTT OOCTAOCAAA GGWGTOODC 
Q V X L KDS ASF P Y Q R K Y P 

TCnXAOCC GAGQOXTAC AAGGHOCA AA/GfflTGTT AW3CAOCTRA 
L R ? EALQGXQ KIV KDLK 

AAGOXAAGG OCTAGTAAAA CCATGCPGTA GOOCCIOCM UdCOWOT 
A Q G L V K PCSS PCN TPI 



TnOSGlAA OSAAAOCEAA OEAOGEG AGGTOGTX AAUU V iLAG 
L L G fiooA KPN G0W R I* V Q DLR 

GATDCTAAT OUEXOQT1T TICCmOTA CXXJUQCXGDk TCITCOCCTT 
I I N EAVF PLY PAV SSPY 

AXACICTGCr TTOaCTfiATA CDVGM3MG OfflGIGCIT TM3GTOCTG 
TLLSLI PEEAEWFTVL 

GACCrnXC OTGOCTTlTr CTGaaaET GTFCGT0CH3 ACKTOWTT 
D L X D AFF CIP VRPD SQF 

crrcrrraac tttgaaovtc cttigaaccc aaogtcioa ciumtauu* 

LFA FEDP LNP TSQ LTWT 

cromrAOC ceaagosttc taxxnecc oaovicnar "kxthaoxa 

VLPQG F ROSP H L F G Q A 

TISGaXAflG JCTDC3CTCA ATTCICMAC CTOGACMC nUlLX-TllA 
L A Q D LSQ FSY L D T L VLQ 

GTAOGIGGAT GATraOTT TPGTO3XCG TTCAGAAAO: T1LTGOJGC 
Y V D D L L L V A R SET LCHQ 
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AAGOOCCEA AGAACIOTA ACTTTOC1CA CBCCIGKE CEACAAQGTT 
ATQ ELL TFLT TCG YKV 

TCOWCOW A3GCTO3XT CEGCTCXAG GAATHGHT ACTEAGOCCT 
SKPK ARL CSQ EIRY LGL 



AAAKHATO: MAQ3CJCCA GSXDCTCAG TGJCGAM3T ATOCAGOCBV 1150 



E E R I Q P I 



KLS KGTR ALS 

TACUXCTTA ICCIUOLTJC AAAAOriAA AGOUCEWG /GDGnXTTT 
LAY P HP KTLK Q L R GFL 



GGOC^ACTG GlTlL'llL'GG 
G I T G F C R 



k Q I 



1200 
1250 



P R Y T P I A 



CAGAOCfflTA TATAOC1AA TBGOSWC TOCAAAGOC AAXMETMT 
RPL YTL2 RET QKA NTYL 

lAGIAAGAIG CZCACCT7CA GWGTGXTT T0OGO3XT AAK3W03X 
VRW TPT EVAF QAL KKA 



CTAAO0CAAG OOCCTGTGTT CNJJTIUULA ACAOSGGAAG ATITTTCnT 
LTQA PVF SLP TGQD FSL 

ATATOCCACA GMAAAAOVG GAATAGCICT AGOGTCCTP AOQOGGTCr 
YAT EKTG I A L G V L TQVS 

CAGGGKD3AG L'XTUMCEC GlUi'lMME TGAGTAAGGA AATKATGTA 
GMS LQP VVYL SKE IDV 

GTGOCAAAGG GH'UJLI. ' flL' A TIGTTCKnZj GMrOTOS OCnOOCT 
V A K G WPH CLW V M A A VAV 

CrTBGTKTCT GM.QCTGTTA AAAT AKQOt GOGAAGAGKr CTDVCTG1GT 
LVS EAVK ITQ GRD LTVW 

CEAOOCTCA TGYIGIGAAC CIXATECTCA. C1UTAAAGG AGA CT J OIGG 
TSH DVN GILT A K G D L W 

TTGTX3KAOV A0CA3TDCT WOK3G GCICTATBC TIGAflGAaCE 
LSDN HLL N Y Q ALLL SEP 

Ai'lUL'llAGA CSSGQCACTT GKCAACKT TAAflCCmx ACWirCITC 
VLR LRTC ATL Kpk TFLP 

CAGftCAOTGA AQUWAOCA GVCMMCT GTOftOtfGT AAnQCTOWV 
DNE E K I EHNC QQV I A Q 

AccDotrrc chx?g3ea GcrrcnoG cttdcctiga ctgkcccda 

TYAA RG0 LLE VPLT D P D 
, *-RN«ttH 

OCTCAMTTTG TOT ALU Wit. GMUT1UCR OCOGAAAAA GGAdTOOWV 
L N jL Y T D G S SL ABK GLRK 

MGCQQ07EA TOCAGTQCC AC2JATWGG GAMACXTGA AKTTATQOC 
AGY AVI SONG ILB S N R 

cracrocAG qacdosc tocogoca gakxaxzks oacrovcrrc 

LTPG TSA KLA E L I A L T W 



ALE LGEG K R V NIY SDSK 



AVIKlUaiA OCTAGiaTC OCTXUMG CKXAADOG GAGAOUSOG 
YAY LVL HAHA AIW RER 



GMCnGCTAA OTCIGMSG AKAGCDm: AKDWOOG AAGOCATDC 
EFLT SEC TPI NHQE AIR 



avGzmnriA tcgogiac agmacuda agagstqqoi ototacjct 

RLL L A V Q KPK EVA VLHC 



GCGGGGTCA TOU3CSAG\A GAGCAMGG3 AAXOGAAGG (3VAT0SOCAA 
Q G H QBE EERE I E G NRQ 
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ATCMCCAGC AGCAQGAQC AGQGIGOOOG GQ3CAAGOQC CT3C3C3CATOC 50 
MIQQ QDX GCP GQAP AHA 



CATCAOOCIC ACAGAGCDOC AQGTATOCTr GAOCATIGAG GGTCAGAAQG 100 
ITL TEPQ VCL TIE GQKG 



GINACIGICT CXTIDGACACT GQ03Qfl3CCT TCICACTCTT ACTITOCTCT 150 
XCL LDT GGAF SVL LSC 



OCIQGACAAC TG?TOCTOCAG AICIGICACT GIOOGAQQGG TOCTAGGACA. 200 
PGQL SSR SVT VRGV LGQ 



GCCAGICACT AGATACTICT OOCW30CACT AAGTIGIGAC TOQQGAACIT 250 
PVT RYFS QPL SCD WGTL 



TACicnra: AamxTrrr ctaattaigc cigaaaqooc caciuiutig 300 

LFP HAF LIMP ESP TLL 



TIG3GGAGAG ACATICEAGC AAAAGCAGGG GOCA3TATAC ATG7IGAATAT 350 
LGRD I L A KAG A I I H VNI 



A3GAGAAGGA ACAACIGITT GITCIGOOCT GCTIGAGGAA GGAATEAATC 400 
GEG TTVC CPL LEE GINP 



CIGAAGTOOG GGCAACAGAA GGACAATA3G GACAAGCAAA GAATOOGCGT 450 
EVR ATE GQYG QAK NAR 



C^iuriUAAG TEAAACEAAA GGATTOCADC TOCTTICOCT ADCAAAGQCA 500 
PVQV KLK DST SFPY QRQ 
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GmOOCCCIC AGAOOOaGk CXXAACAftGA ACTOCAAAAG ATTGTAAAGS 550 
YPL RPET QQE LQK IVKD 



AOCTAAAAGC OCAAGQCCTA CTAAAAOCAA GCAATAGCX3C TIQCAAGACT 600 
LKA QGL VKPS NSP CKT 



OCAATTTIAG GAGEAAQGAA ACD2AAOQ3A CAG?IQGft3G?r TAGIQCAAGA 650 
PILG VRK PNG QWRL VQE 



ACICAGGAIT ATCAAIGAGG CIGTIGTICC TCEATACXXA GCT3TA0CTA 700 
L R I INEA VVP LYP AVPN 



AOXTEATAC ACTQCTETCC CAAATACCAG A3GAAGCAGA GKCTTEACA 750 
PYT VLS QIPE EAE WFT 



GTOCT3GAOC 1TAAGGATOC CTITTICIGC ATOCTTCIAC GIOTIGACIC 800 
VLDL KDA FFC IPVR PDS 



TCAATICTIG TnXXXJETIG AA3ATOCTIT GAACCXAAQ3 TCICAACTCA 850 
QFL FAFE DPL NPT SQLT 



CUIGGACICT TTEACOOCAA GQC?ITCAQQG ATAQOOOQCA TCTA3TIGGC 900 
WTV LPQ GFRD SPH LFG 



CAGQCATIAG CCXMGACIT GftGICAATIC TCA3AOCTOG ACACICTIGT 950 
QALA QDL SQF SYLD TLV 



OCTICAGIAC ATOGATCAIT TftLTlTUAGT GQQOCCTICA GAAAOCTICT 1000 
LQY MDDL LLV A 'r S ETLC 
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GOCATCAAGC CAOGCAAGAA CTCTTAACTT TOCTCACEAC CIUIOGCTAC 1050 
HQA TQE LLTF LTT CGY 



aaggtitioca aaocaaaggc tcqqcicio: tcacaqgaga tiagatacin 1100 
kvsk pka rlc sqei ryx 



A3QQCEAAAA TEATCCAAAG GCADCAQQ3C CCTCAGIGAG GAACCTATCE 1150 
GLK LSKG T R A LSE ERIQ 



^XUKVEACT GQCTEA30CT CATCDCAAAA CXX7TAAAGCA ACEAAGAGQG 1200 
PIL AYP HPKT LKQ LRG 



TICCTIGQCA TAACAGGTIT CIGOOGAAAA CAGATTOOCA GCTACASCXX 1250 
FLGI TGF CRK QIPR YXP 



AATAQOCAGA (XATEATATA CACEAATEAN GGAAACICAG AAAGOCAATA 1300 
I A R PLYT L I X ETQ KANT 



GCEATFEAGT AAGATGGACA OCEACAGAAG TOQLTl'lUCA QQODCTAAAG 1350 
YLV RWT PTEV AFQ ALK 



AAGGOCCTAA COCAAQOGOC AGIGTICAGC TIGOCAACAG G3CAAGATTT 1400 
KALT QAP VFS LPTG QDF 



TIUITEATAT GOCACAGAAA AAACAGGAAT A3CICTAGGA GTOCTIAOQC 1450 
SLY ATEK TGI ALG VLTQ 



AGGTCTCAGG GATGAGCTTG CAADOOGIGG TATADCTGAG TAAGGAAATT 1500 
VSG MSL QP VV YLS KEI 
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GATOEACTGG CAAAQQGTIG GCXTTCAINSr TTATOQGEAA TOC3\GGCAGT 1550 
D V V A KGW PHX LWVM XAV 



A3CAGICINA. GIATCIGAAG OGTEAAAAT AATACAGQGA. AGAGATCTIN 1600 
AVX VSEA VKI IQG RDLX 



CI01Q1QGRC AICICOIGAT GIGAAOQGCA TACISRCIGC TAAAGGAGAC 1650 
VWT SHD VNGI L X A KGD 



TTOIGGnCT CAGACAAOCA TTTACTEAAN TOTCAGGCYY TATTACTIGA 1700 
LWLS DNH LLX YQAL LLE 



AGAGOCAGIG GCALT1UIUC AACICTEAAA CCKAAACTIA 1750 

EPV LXLR TCP TLK PKLM 



TQCTOOOCAG AAG3ATCTIT NEAGAGGIDC OCTEAGCCAA OOCIGACCIC 1800 
LPR RIF XEVP LAN PDL 



AACHAEAEAT AEACTGA3GG AACTIOGITT GEAGAAAAGG GATEACAAAG 1850 
NYIY TDG SSF VEKG LQR 



GC^IAGGATAT NOCAEAGGIG TEAGIGATAA A3CAGIACTT GAAAGTAAGC 1900 
XGY XIGV SDK A V L ESK P 



CICTIDOOC3C OGAGGGADCA QOQOOOOCCT TAGCAGAACT AGIG3CACIG 1950 
LPP QGP APPL AEL VAL 



AXC030GAG CCTEAGAACT TIGGAAAGQG AQGAGGAEAA AIGIGEAIEAC 2000 
TPRA LEL WKG RRIN VYT 






FIG 48E 



10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

AGATAGCAAG TATOCTTATC TAATCOGAAA TOG0CAIG3T GCAATMOGA 2050 
DSK YAYL IRN A H V AIWK 



AAGAAAQQGA GTIOCTAAOC TCIGQQQGAA CXXOCATIAA AIAOCACAAG 2100 
ERE FLT SGGT PIK YHK 



TEAAICA1QG AGTEATTOCA CACAGIGCAA AAACICAAGG AGGIGGAAGT 2150 
LIME LLH TVQ KLKE VEV 



CTEACACIGC CAAAQOCATC AGAAAAG3GA AAGAG3GGAA GAGCAQCAEA. 2200 
LHC QSHQ KRE RGE EQHK 



AGIQ3CTACA GAGQCAAGGA AAGACTA3CA GAAAGGAAAG AGAGAAAGAG 2250 
WLQ RQG KTSR KER EKE 



ACAGAAAGIC AGAGAGAGAG AGAGGAAGAG ACAGAGCACA AAGAGQGAGT 2300 
TESQ RER EEE TEHK EGV 



CAGAGAGAGA GAGAGACAGA GAGICAGAGA GAAGGAAAGA GAGAGAG3AA 2350 
RER ERQR VRE KER ERG ; R 



GAGA2AAAGA AIGA 2364 
D K E . 
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FIG 49A 



Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 



3ACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTC I 



SACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTC I 



AGTAT j 3GGA 

3ACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCUlC AGTAT ] 3GGA 
^r^m^nm ft wYvnrwt rvTYy.ftnaTT rnTOT^rUtr* arrrMttaoGA 



AGTAT^QGGA 



50 
50 
50 

50 



Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 



KSAhTTAATT ATAGCCACCC ATTCAGAAAC CTTC3TGGCA V 
IGA : TEAATT ATAGCCACCC ATTCAGAAAC CTTGTQGCA T 
TGA I TTAATT ATAGCCACCC ATTCAGAAAC CTTGTQGCA 2 

irrravm* m^mr aT»mar;a&an rrrryrYysral hAftfYTAfTTl 



bs&J 



2AAGCCACCC 
^AGCCACCC 
ZAAGCCACCC 



100 
100 
100 

100 



Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 



Consensus 



AAG I3CTCTT 
AAG : GCTCTT 
AAG I GCTCTT 



AAATTTCCT 2 
AAATTTCCT T 
AAA TTTCCT 2 



3CTAOCTGTG GCTCCAAACA AA- 
3CTACCTOIG GCTCCAAACA AAft 
3CTACCTGTG GCTCCAAACA AA i 



**Mv*ir*TH* aa&M»i»mriV timmrr^rrm ryrrr'aa&ra aa 



150 
150 
150 

150 



Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 

Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 

Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 



CTCTGCTCAC A * CAGGTTAA ATACTTAGGG CTAAAATTAT 
CTCTGCTCAC A 1 CAGGTTAA ATACTTAGGG CTAAAATTAT 
CTCTGCTCAC A' CAGGTTAA ATACTTAGGG CTAAAATTAT 



ccaaagtc : : 

CCAAAGTC P Z 
CCAAAGTC A Z 



nnr~nym*n aHnamrraa AVhrTTAonrz PTaaaaTTaT rY^aaanrrcrfc 



CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGG s T 
CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGG : T 
CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGG ". T 
^rryyvynr arcana ma an irranrvarra T*T&nrrjii» TaTTrrMranri 



TATCCICATC 



TATCCICATC 
TATCC:CATC 



cca e aaccst 
ccaiaacc:t 



AAAGCAACTA 
AAAGCAACTA 
AAAGCAACTA 



AGA 7 3GTTCC 



aga:ggt 



ccapaacc:t 

^JJaarvUr aaararaapra Afsal 



AGA ? 3GTTCC 



TTGGCATABC 
TCC TIGGCATApfZ 
TTGGCA' 



ATA 5 C 



AGCCTTCTGC 
AGCCTTCTGC 
AGCCTTCTGC 



200 
200 
200 

200 

250 
250 
250 

250 



300 
300 
300 
Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 

Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 



CGAATATGGA TTCCC : 3ATA 
CGAATATGGA TTCCC : 3ATA 
CGAATATGGA TTCCC c 3ATA 



CAGI3AAATA 
CAC E 3AAATA 
CAG:3AAATA 



GCCAGGCCAT TATGTACATT 
GCCAGGCCAT TATGTACATT 
GCCAGGCCAT TATGTACATT 



-viaaTaTra a aZPGQCfcCATfl ra/AfcaaaTO rmrimmn* TaTOraraTg 



fihtflAAGGAA ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG 
A \1 IAAGGAA ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG 
fire TAAGGAA ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG 
ahJraaraaa nnra^A^ rramrrrt TOTOrym&r.t Trrearaonr: 



350 
350 
350 

350 



400 
400 
400 

400 



Complement of 8/46-7 propre 
1 /46-7 propre 

Complement of cl5 propre 46-7 
Consensus 



a:acagaagt 
a 5 acagaagt 
agacagaagt 



l&RACAGAAGX. 



GGCTTTCCAG GCCCTAAAG 
GGCTTTCCAG GCCCTAAAG 
GGCTTTCCAG GCCCTAAAG 
#.y.y«m'iv<rar3 rrvrraaar: 



429 
429 
429 

429 
Trans of 1 /46-7 pr 
Trans of Complement-2 
Trans of Complement 

Consensus 



Trans of 1 /46-7 pr 
Trans of Complement-2 
Trans of Complement 



DLSQSSYLDI LV[ 
DLSQSSYLDI LVC 
DLSQSSYLDI LV E 

DLSQSS2£LDI_Ui : 



:ddli xathsgtlwh 
fddli iathsetlwh 

:DDLI IATHSETLWH 



QA^fejLLNFL ATCGSKQjjlfe 
QATC . LLNFL ATCGSKQ : R 2 
QATC ! LLNFL ATCGSKQ<A3 



■pVKYLG LKLSKVIRAL 
JQVKYLG LKLSKVIRAL 
fcVKYLG LKLSKVIRAL 

mnrnn T,KT, < ?TnmmflT l 



RKERIQRII \ 
REERIQRH D 
REERIQKXI \ 

RFRRTQffllR 



Hi 



KQL R 
R 



PfQTi S 



mm 
100 
Trans of 1 /46-7 pr 
Trans of Complement-2 
Trans of Complement 

Consensus 



RIWXE : YSEI 
RlWIPf YSEI 
YSEI 



ARPLCTLIKE 



ARPLCTX y KE 
ARPLCTL :KE 



ZSEX-ABELCEL 



TQKANTHIVR WTPETEVAFQ ALK 
TQKANTHIVR WTPETEVAFQ ALK 
TQKANTHIVR WTPETEVAFQ ALK 



143 
143 
143 

143 



FIG 50B 



Trans of cl43 propr 
Trans of 42/68-1 pr 
Trans of 41/68-1 pr 

Consensus 

Trans of cl43 propr 
Trans of 42/68-1 pr 
Trans of 41/68-1 pr 

Consensus 

Trans of cl43 propr 
Trans of 42/68-1 pr 
Trans of 41/68-1 pr 

Consensus 



pLSQSSYLE^C 

r 
r 

r 



LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 
LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKT 
LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 
r.VT.RYMnriT.T, T.RTHSFTTrH OftTnftT.T.MFT. a^vmretrr 



KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 



GFCQIWIPRY SK 
GFCQIWIPRY SK 
GFCQIWIPRY SK 


ARPLNTR IKETQKA 
ARPLNTR IKETQKA 
ARPLNTR IKETQKA 


IH LVRWT 
IH LVRWT 
IH LVRWI 

TH T.WfiTT 


EAEV AFQALK 
EAEV AFQALK 
EAEV AFQALK 

FAEV ftFQAT.K 
50 
50 

50 



100 
100 
100 

100 
FIG 50A 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



3ACTTGAGCC AGTC : ICATA CCTGGACA \ T CTTSTCCTTC GGTACAT9GA 
3ACTTOAGCC AGTC i ICATA CCTGGACA i T LTlUlUJl ' Ai: GGTACATGGA 
3ACTTGAGCC AGTC : ICATA CCTGGACA : T CTTGICCTTC GGTACATGGA 



fyrrnrftTOca 



50 
50 
50 

50 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



TGATTTACTT TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC 
TGATTTACTT TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC 
TGATTTACTT TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC 

TGATTTACTT TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC 



100 
100 
100 

100 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 
AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 
AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 
AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 



150 
150 
150 

150 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 
AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 
AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 
AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 



200 
200 
200 

200 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT 
CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT 
CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT 
CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT 



250 
250 
250 

250 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 



ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 
ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 
ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 
ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 



300 
300 
300 

300 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



3GTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAA 3AIAG 



3GTTTCTGOC AAATATGGAT TCCCAGGTAC AGCAAMXAG 

AGCAA\C EAG 



3GTTTCTGCC AAATATGGAT TCCCAGGTAC 



(XAGACCATT 
CCAGACCATT 
CCAGACCATT 



350 
350 
350 

350 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



AAATACACGA ATTAAGGAAA CTCAAAAAGC 
AAATACACGA ATTAAGGAAA CTCAAAAAGC 
AAATACACGA ATTAAGGAAA CTCAAAAAGC 

AAATACACGA ATTAAGGAAA CTCAAAAAGC 



CA P rACCCAT TTAGTAAGAT 
CA B rACCCAT TTAGTAAGAT 
CA " rACCCAT TTAGTAAGAT 

gMTTAfTTAT TTAfrTBAGAT 



400 
400 
400 

400 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus 



AGCAGAAGTG GCTTTCCAGG CCCTAAAG 
AGCAGAAGTG GCTTTCCAGG CCCTAAAG 
2TGA AGCAGAAGTG GCTTTCCAGG CCCTAAAG 
Wrea a ry area » rare -i-,-it t mrwmr. 
MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 



Consensus 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



ATTATGCCTG AAAGCCCCAC TCCCTTGTTA GGGAGAGACA TTTTAGCAAA 
ATTATGCCTG AAAGCCCCAC TCCCTTGTTA GGGAGAGACA TTTTAGCAAA 
AGCAGGGGCC ATTATACACC TGAACATAGG AAAAGGAATA CCCATTTGCT 
AGCAGGGGCC ATTATACACC TGAACATAGG AAAAGGAATA CCCATTTGCT 
GTCCCCTGCT TGAGGAAGGA ATTAATCCTG AAGTCTGGGC AATAGAAGGA 



GTCCCCTGCT TGAGGAAGGA ATTAATCCTG AAGTCTGGGC AATAGAAGGA 
CAATATGGAC AAGCAAAGAA TGCCCGTCCT GTTCAAGTTA AACTAAAGGA 
CAATATGGAC AAGCAAAGAA TGCCCGTCCT GTTCAAGTTA AACTAAAGGA 
TTCTGCCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA CCCGAGGCCC 
TTCTGCCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA CCCGAGGCCC 
TACAAGGANC TCAAAAGATT GTTAAGGACC TAAAAGCCCA AGGCCTAGTA 



TACAAGGANC TCAAAAGATT GTTAAGGACC TAAAAGCCCA AGGCCTAGTA 
AAACCATGCA GTAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACC 
AAACCATGCA GTAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACC 
CAACGGACAG TGGAGGTTAG TGCAAGATCT CAGGATTATT AATGAGGCTG 
CAACGGACAG TGGAGGTTAG TGCAAGATCT CAGGATTATT AATGAGGCTG 
TOTlUriXJl 1 ATACCCAGCT GTATCTAGCC CTTATACTCT GCTTTCCCTA 



TmTlirTCT ATACCCAGCT GTATCTAGCC CTTATACTCT GCTTTCCCTA 
ATACCAGAGG AAGCAGAGTG GTTTACAGTC CTGGACCTTA AGGATGCCTT 



ATACCAGAGG AAGCAGAGTG GTTTACAGTC CTGGACCTTA AGGATGCCTT 
TTTCTGCATC CCTGTACGTC CTGACTCTCA ATTCTTGTTT GCCTTTGAAG 



TTTCTGCATC CCTGTACGTC CTGACTCTCA ATTCTTGTTT GCCTTTGAAG 
ATCCTTTGAA CCCAACGTCT CAACTCACCT GGACTGTTTT ACCCCAAGGG 
ATCCTTTGAA CCCAACGTCT CAACTCACCT GGACTGTTTT ACCCCAAGGG 
MSRV pol TTCAGGGATA CgCCCCATCT ATTTGGCCAG GCATTAGCCC AJ 3A CTTCA G 
cons ADN 1,5,8 3ACTTGAG 

Consensus TTCAGGGATA GCCCCCATCT ATTTGGCCAG GCATTAGCCC AH3ACHBBC 



650 
8 

650 



MSRV pol 

cons ADN 1,5,8 

Consensus 




MSRV pol 

cons ADN 1,5,8 

Consensus 



cc:ttcagaa accttgtg:c ai^aagccac ccaa 



CC - TTCAGAA ACCTTGTG ] Z A i 2AAGCCAC CCAA 



camrar.aa arr 



750 
108 

750 



MSRV pol 
cons ADN 1,5,8 



itaa : nrcc i^ctacctg tqgchcaag gtts xaaac 

PTAA ' TTTCC T it CTACCTG TGGCT — — — CCAAAC 

Traftkfrwnnn TtnJm*n t t -n"i TQGC&CAAG (JIT 1 1 1 jAAACL 



800 
149 

800 






WO 98/23755 



FIG 51A 



PCT/IB97/01482 



MSRV pol 

cons ADN 1,5,8 

Consensus 



dcTCTCSCTCA C^pdAGfcfTTA fiATACTTAGG GCTAAAATTA 
SCTCTGCTCA CA 3C AG * TCA : ATACTTAGG GCTAAAATTA 



< l «»-«Tr jr Ty& r»&k dar HT*TO Uvra^Trranr; rarraaaRTTA TTY^AA 
MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



^CAGGGCCCT^CAG]i1gAGGAA CGTATCCAGC flEATACTQGp pTAlCCllEAT 
CCAGGGCCCT CAG fi 3AGGAA CGTATCCAGC " rATACTGG * ITATCC « CAT 

mrywre v , ™« Jrwyr» rvyrafnnnar!/ " ^raTAgmck bmTnrMr&T 




rAAAGCAACT AAGA^GTTC CTTGGCATAlV E 
XAAAGCAACT AAGA ? 3GTTC CTTGGCATA p 
Tttfiinr***™ ^niAinfyT^n rTTrarftTato 




3 ATTCCC! 
S ATTCOC « 


:r aca 
= r ACA 

=r_ACA 


^CCC 
3YG* 


AAT AGCCAG 
AAT AGCCAG 


CCA TTAI 
CCA TTAT 


iaca: 
eaca r 



^VlfTAtfeGA AACTCAGAAA GCCAAXACdr ESttAGTAAG ATGGACACCT 
sfct SGA AACTCAGAAA GCCAATAOC Z jATprAGTAAG ATGGACACCT 
r ftinaUTn& **rvr*rm? nnnnimrA/ tATWrAfTTAAfl &3PGAC&CCI 
MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 



MSRV pol 

cons ADN 1,5,8 

Consensus 



MSRV pol 

cons ADN 1,5,8 

Consensus 



ACAGAAG TGGCTTTCCA GGCCCTAAAG RAGGCCCTAA CCCAAGCCCC 

GAF ACAGAAG T GGCTTTCCA GGCCCTAAAG 

/^ pk^nrmrs i mnnn* fyyw^ra&np- karryrY^TAJi CCCAAGCCCC 



1097 
429 

1100 



AGTGTTCAGC TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA 1147 
429 

AGTGTTCAGC TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA 



1150 



AAACAGGAAT AGCTCTAGGA GTCCTTACGC AGGTCTCAGG GATGAGCTTG 

AAACAGGAAT AGCTCTAGGA GTCCTTACGC AGGTCTCAGG GATGAGCTTG 

CAACCCGTGG TATACCTGAG TAAGGAAATT GATGTAGTGG CAAAGGGTTG 

CAACCCGTGG TATACCTGAG TAAGGAAATT GATGTAGTGG CAAAGGGTTG 

GCCTCATTGT TTATGGGTAA TGGCGGCAGT AGCAGTCTTA GTATCTGAAG 

GCCTCATTGT TTATGGGTAA TGGCGGCAGT AGCAGTCTTA GTATCTGAAG 

CAGTTAAAAT AATACAGGGA AGAGATCTTA CTGTGTGGAC ATCTCATGAT 

CAGTTAAAAT AATACAGGGA AGAGATCTTA CTGTGTGGAC ATCTCATGAT 1350 

GTGAACGGCA TACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA 1397 
429 

GTGAACGGCA TACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA 1400 

TTTACTTAAT TATCAGGCTC TATTACTTGA AGAGCCAGTG CTGAGACTGC 1447 

TTTACTTAAT TATCAGGCTC TATTACTTGA AGAGCCAGTG CTGAGACTGC 1450 